NFDI4DS | UHH-SEMS - Publication Details

Hong Liu

ORCID: 0000-0003-4524-495X

Publications

About

Contact & Profiles

OpenAlex author ID: A5100410374

Research Areas

Music and Audio Processing
Speech Recognition and Synthesis
Speech and Audio Processing
Radiomics and Machine Learning in Medical Imaging
Distributed and Parallel Computing Systems
Video Analysis and Summarization
Tactile and Sensory Interactions
AI in cancer detection
Cloud Computing and Resource Management
Video Surveillance and Tracking Methods
Natural Language Processing Techniques
Hand Gesture Recognition Systems
Medical Imaging and Analysis
Advanced Neural Network Applications
Digital Games and Media
Peer-to-Peer Network Technologies
Retinal Imaging and Analysis
Advanced Database Systems and Queries
Data Management and Algorithms
Artificial Intelligence in Games
Seismic Waves and Analysis
Brain Tumor Detection and Classification
Retinal Diseases and Treatments
Music Technology and Sound Studies
Sports Analytics and Performance

Chinese Academy of Sciences
2014-2024

Institute of Computing Technology
2009-2024

University of Chinese Academy of Sciences
2024

Institute of Geology and Geophysics
2021

AIROGS: Artificial Intelligence for Robust Glaucoma Screening Challenge

OPENALEX - Publications

Coen de Vente, Koenraad A. Vermeer, Nicolas Jaccard, He Wang, Hongyi Sun, and 31 more

The early detection of glaucoma is essential in preventing visual impairment. Artificial intelligence (AI) can be used to analyze color fundus photographs (CFPs) in a cost-effective manner, making screening more accessible. While AI models for glaucoma screening from CFPs have shown promising results in laboratory settings, their performance decreases significantly in real-world scenarios due to the presence of out-of-distribution and low-quality images. To address this issue, we propose the Artificial Intelligence for Robust Glaucoma...

10.1109/tmi.2023.3313786 article EN cc-by IEEE Transactions on Medical Imaging 2023-09-15

Benign and malignant diagnosis of spinal tumors based on deep learning and weighted fusion framework on MRI

Hong Liu, Menglei Jiao, Yuan Yuan, Hanqiang Ouyang, Jianfang Liu, and 7 more

The application of deep learning has allowed significant progress in medical imaging. However, few studies have focused on the diagnosis of benign and malignant spinal tumors using imaging and age information at the patient level. This study proposes a multi-model weighted fusion framework (WFF) for this diagnosis based on magnetic resonance imaging (MRI) images and age information. The proposed WFF included a tumor detection model, a sequence classification model, and an age information statistic module, based on sagittal MRI sequences obtained from 585 patients with spinal tumors (270...

10.1186/s13244-022-01227-2 article EN cc-by Insights into Imaging 2022-05-10

A Deep Learning Method for Denoising Based on a Fast and Flexible Convolutional Neural Network

Wenda Li, Hong Liu, Jian Wang

Seismic data denoising has always been an indispensable step in the seismic exploration workflow. The quality of the denoising results directly affects subsequent inversion and migration imaging. In this article, we proposed a fast and flexible convolutional neural network (FFCNN) based on DnCNN. In contrast to the existing DnCNN and other artificial intelligence (AI)-based denoisers, FFCNN enjoys several desirable properties: 1) downsampling and upscaling operations, which can sensibly reduce runtimes and memory requirements...

10.1109/tgrs.2021.3073001 article EN IEEE Transactions on Geoscience and Remote Sensing 2021-04-23

RGB-D joint modelling with scene geometric information for indoor semantic segmentation

Hong Liu, Wenshan Wu, Xiangdong Wang, Yueliang Qian

10.1007/s11042-018-6056-8 article EN Multimedia Tools and Applications 2018-05-21

Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection

Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

In this article, a special decision surface for the weakly-supervised sound event detection (SED) and a disentangled feature (DF) for the multi-label problem in polyphonic SED are proposed. We approach SED as a multiple instance learning (MIL) problem and utilize a neural network framework with a pooling module to solve it. General MIL approaches include two kinds: the instance-level approaches and the embedding-level approaches. We present a method of generating instance-level probabilities for the embedding-level approaches, which tend to perform better than the instance-level approaches in terms of bag-level...

10.1109/taslp.2020.2989575 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01
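
The abstract above frames weakly-supervised SED as multiple instance learning with a pooling module. As a rough illustration of what such a pooling step does in the simpler instance-level setting (this is not the paper's embedding-level method or its decision surface), the sketch below aggregates per-frame probabilities for one event class into a single clip-level probability. The function names and the example values are hypothetical.

```python
# Minimal sketch of instance-level MIL pooling for one sound event class.
# Assumption: frame_probs holds per-frame probabilities produced by some
# frame-level classifier; the values below are made up for illustration.

def max_pool(frame_probs):
    """Clip-level probability is the most confident frame."""
    return max(frame_probs)

def mean_pool(frame_probs):
    """Clip-level probability is the average over all frames."""
    return sum(frame_probs) / len(frame_probs)

def linear_softmax_pool(frame_probs):
    """Each frame is weighted by its own probability: sum(p^2) / sum(p)."""
    total = sum(frame_probs)
    if total == 0:
        return 0.0
    return sum(p * p for p in frame_probs) / total

# Hypothetical frame-level outputs for a five-frame clip.
frame_probs = [0.1, 0.2, 0.9, 0.8, 0.1]

for name, pool in [("max", max_pool), ("mean", mean_pool),
                   ("linear softmax", linear_softmax_pool)]:
    print(name, round(pool(frame_probs), 4))
```

Only the clip-level value is compared with the weak label during training, so the choice of pooling function determines how the clip-level error is distributed back to individual frames.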

Semi-Supervised Sound Event Detection with Local and Global Consistency Regularization

Yiming Li, Xiangdong Wang, Hong Liu, Rui Tao, Long Yan, and 1 more

Learning meaningful frame-wise features on a partially labeled dataset is crucial to semi-supervised sound event detection. Prior works either maintain consistency on frame-level predictions or seek feature-level similarity among neighboring frames, which cannot exploit the potential of unlabeled data. In this work, we design a Local and Global Consistency (LGC) regularization scheme to enhance the model on both the label and feature levels. The audio CutMix is introduced to change the contextual information of clips. Then,...

10.1109/icassp48485.2024.10446386 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Guided Learning for Weakly-Labeled Semi-Supervised Sound Event Detection

Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

We propose a simple but efficient method termed Guided Learning for weakly-labeled semi-supervised sound event detection (SED). There are two sub-targets implied in weakly-labeled SED: audio tagging and boundary detection. Instead of designing a single model by considering a trade-off between the two sub-targets, we design a teacher model aiming at audio tagging to guide a student model aiming at boundary detection to learn using the unlabeled data. The guidance is guaranteed by the performance gap between the two models. In the meantime, the student model, liberated from the trade-off, is able to provide more excellent results. The principle of such...

10.1109/icassp40776.2020.9053584 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

A Parallel Algorithm for Game Tree Search Using GPGPU

Liang Li, Hong Liu, Hao Wang, Taoying Liu, Wei Li

Game tree search is a classical problem in the field of game theory and artificial intelligence. A fast game tree search algorithm is critical for computer games asking for real-time responses. In this paper, we focus on how to leverage the massive parallelism capabilities of the GPU to accelerate the speed of game tree search algorithms, and propose a concise and general parallel game tree search algorithm on the GPU. The performance model of our algorithm is presented and analyzed theoretically. We implement the algorithm for two real computer games called Connect6 and Chess. We also use these two games to verify the effectiveness and efficiency of our algorithm. Experiments support...

10.1109/tpds.2014.2345054 article EN IEEE Transactions on Parallel and Distributed Systems 2014-07-31

Optical Braille Recognition Based on Semantic Segmentation Network with Auxiliary Learning Strategy

Renqiang Li, Hong Liu, Xiangdong Wang, Jianxing Xu, Yueliang Qian

Optical Braille Recognition methods usually use many designed steps, such as image deskewing, dots detection, cell grids construction and character recognition, which are less robust for complex scenes. This paper proposes an optimal semantic segmentation framework, BraUNet, to directly detect and recognize Braille characters in the whole original images. BraUNet adds an extra auxiliary learning strategy to the UNet network, and uses long-range connections of feature maps between the encoder and decoder to get more low-level features....

10.1109/cvprw50498.2020.00285 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020-06-01

Multi-Branch Learning for Weakly-Labeled Sound Event Detection

Yuxin Huang, Xiangdong Wang, Liwei Lin, Hong Liu, Yueliang Qian

There are two sub-tasks implied in the weakly-supervised SED: audio tagging and event boundary detection. Current methods which combine multi-task learning with SED require annotations for both of these sub-tasks. Since there are only annotations for audio tagging available in weakly-supervised SED, we design multiple branches with different learning purposes instead of pursuing multiple tasks. Similar to multiple tasks, multiple learning purposes can also prevent the common feature that the branches share from overfitting to any one of the purposes. We design the learning purposes based on combinations of different MIL strategies and pooling methods. Experiments on the DCASE 2018 Task 4...

10.1109/icassp40776.2020.9053023 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Application-aware Interface for SOAP Communication in Web Services

Hao Wang, Yizhu Tong, Hong Liu, Taoying Liu

The SOAP protocol has emerged as the Web service communication standard. Because of its relatively poor performance, many researchers focus on improving the speed of processing SOAP messages. In this paper, we propose SPI, which introduces the client usage pattern to the low-level SOAP process infrastructure, in order to improve the performance of some kinds of Web services applications with specific usage patterns. The pack interface of SPI is an approach to reduce the number of SOAP messages and the latency on the client side. This optimization technique packs concurrent requests into...

10.1109/clustr.2006.311886 article EN 2006-01-01

A Multimodal Adaptive Cooperative Learning Framework for Cancer Survival Risk Prediction

Zekang Yang, Hong Liu, Xiangdong Wang

Computer-aided cancer survival risk prediction plays an important role in the timely treatment of patients. This is a challenging weakly supervised ordinal regression task associated with multiple clinical factors, such as pathological images and genomic data. In this paper, we propose a new training method, multimodal object-level contrast learning, for cancer survival risk prediction. First, we construct contrast learning pairs based on the relationship among the samples in the sample set. Then we introduce the contrast learning method to train...

10.1145/3688868.3689195 article EN cc-by 2024-10-28

Improving speech transcription by exploiting user feedback and word repetition

Xiangdong Wang, Ying Yang, Hong Liu, Yueliang Qian

10.1007/s11042-017-4714-x article EN Multimedia Tools and Applications 2017-06-08

Audio-Free Prompt Tuning for Language-Audio Models

Yiming Li, Xiangdong Wang, Hong Liu

Contrastive Language-Audio Pretraining (CLAP) is pre-trained to associate audio features with human language, making it a natural zero-shot classifier to recognize unseen sound categories. To adapt CLAP to downstream tasks, prior works inevitably require labeled domain audios, which limits their scalability under data scarcity and deprives them of the capability to detect novel classes as the original CLAP does. In this work, by leveraging the modality alignment in CLAP, we propose an efficient audio-free prompt...

10.1109/icassp48485.2024.10446472 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Leveraging Language Model Capabilities for Sound Event Detection

Hualei Wang, Jianguo Mao, Zhifang Guo, Jiarui Wan, Hong Liu, and 1 more

Large language models reveal deep comprehension and fluent generation in the field of multi-modality. Although significant advancements have been achieved in audio multi-modality, existing methods rarely leverage language models for sound event detection (SED). In this work, we propose an end-to-end framework for understanding audio features while simultaneously generating sound events and their temporal location. Specifically, we employ pretrained acoustic models to capture discriminative features across different categories and language models for autoregressive text...

10.21437/interspeech.2024-112 article EN Interspeech 2024 2024-09-01

Advancing Multi-grained Alignment for Contrastive Language-Audio Pre-training

Yiming Li, Zhi‐Fang Guo, Xiangdong Wang, Hong Liu

Recent advances have been witnessed in audio-language joint learning, such as CLAP, that shows much success in multi-modal understanding tasks. These models usually aggregate uni-modal local representations, namely frame or word features, into global ones, on which the contrastive loss is employed to reach coarse-grained cross-modal alignment. However, frame-level correspondence with texts may be ignored, making it ill-posed on explainability and fine-grained challenges, which may also undermine performances...

10.1145/3664647.3681145 preprint EN 2024-10-26

BgNet: Classification of benign and malignant tumors with MRI multi-plane attention learning

Hong Liu, Menglei Jiao, Xiaoying Xing, Hanqiang Ouyang, Yuan Yuan, and 8 more

To propose a deep learning-based classification framework, which can carry out patient-level classification of benign and malignant tumors according to the patient's multi-plane images and clinical information. A total of 430 cases of spinal tumor, including axial and sagittal plane images by MRI, were included: 297 cases for training (14072 images) and 133 cases for testing (6161 images). Based on the bipartite graph and attention learning, this study proposed a multi-plane attention learning framework, BgNet, for benign and malignant tumor diagnosis. In the bipartite graph structure, the tumor area in each plane is used as the vertex of the graph, and the matching...

10.3389/fonc.2022.971871 article EN cc-by Frontiers in Oncology 2022-10-27

Language model adaptation based on correction information for interactive speech transcription

Jia Duan, Xiangdong Wang, Yuzhuo Ma, Yang Yang, Hong Liu, and 1 more

Aiming at language model (LM) adaptation for interactive speech transcription, this paper proposes a topic-based LM adaptation method using users' correction information. To infer the topic of each utterance in continuous speech, the method uses the correction information of history utterances adjacent to the current one. Perplexity is calculated for topic inference. Topic-related LMs are interpolated with the background LM to obtain adapted LMs. Each utterance is transcribed with the adapted model. This is a supervised method, which is believed to outperform the unsupervised approaches widely used...

10.1109/pic.2016.7949506 article EN 2016-12-01
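
The abstract above mentions two steps: perplexity-based topic inference over history utterances, and interpolation of a topic LM with a background LM. The sketch below shows both with toy unigram models. It is only an illustration under stated assumptions: the paper's actual LM order, interpolation weights, and data are not given in the truncated abstract, so the unigram tables, the weight `lam`, the probability floor, and all function names here are hypothetical.

```python
import math

def interpolate(topic_lm, background_lm, lam):
    """Linear interpolation: lam * P_topic(w) + (1 - lam) * P_background(w)."""
    vocab = set(topic_lm) | set(background_lm)
    return {w: lam * topic_lm.get(w, 0.0) + (1.0 - lam) * background_lm.get(w, 0.0)
            for w in vocab}

def perplexity(lm, words, floor=1e-6):
    """Unigram perplexity of a word sequence; unseen words get a small floor."""
    log_sum = sum(math.log(max(lm.get(w, 0.0), floor)) for w in words)
    return math.exp(-log_sum / len(words))

def infer_topic(topic_lms, history_words):
    """Pick the topic whose LM gives the lowest perplexity on the history."""
    return min(topic_lms, key=lambda t: perplexity(topic_lms[t], history_words))

# Toy unigram models (made-up probabilities, each summing to 1).
background = {"the": 0.5, "game": 0.25, "tumor": 0.25}
topic_lms = {
    "sports": {"the": 0.4, "game": 0.5, "tumor": 0.1},
    "medical": {"the": 0.4, "game": 0.1, "tumor": 0.5},
}

# Words from corrected history utterances adjacent to the current one.
history = ["the", "tumor", "tumor"]

topic = infer_topic(topic_lms, history)
adapted = interpolate(topic_lms[topic], background, lam=0.5)
print(topic, adapted)
```

Because both inputs are proper distributions and the weights sum to one, the interpolated model is also a proper distribution over the shared vocabulary.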

Speech Synthesis of Chinese Braille with Limited Training Data

Jianguo Mao, Jingwen Zhu, Xiangdong Wang, Hong Liu, Yueliang Qian

This paper describes, to our knowledge, the first Chinese Braille speech synthesis system. The system consists of modules for front-end processing, prosody prediction, and speech synthesis. The front-end processing includes conversion from common Braille to Pinyin and a high-precision character prediction model. To achieve high precision under limited corpus conditions, we propose a model based on the RoBERTa pre-trained model, which achieves an accuracy of 94.42%. Finally, a real-time TTS model based on Tacotron2 and LPCNet is proposed. We modify Tacotron2,...

10.1109/icme51207.2021.9428160 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09

Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection

Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

In this paper, a special decision surface for the weakly-supervised sound event detection (SED) and a disentangled feature (DF) for the multi-label problem in polyphonic SED are proposed. We approach SED as a multiple instance learning (MIL) problem and utilize a neural network framework with a pooling module to solve it. General MIL approaches include two kinds: the instance-level approaches and the embedding-level approaches. We present a method of generating instance-level probabilities for the embedding-level approaches, which tend to perform better than the instance-level approaches in terms of bag-level...

10.48550/arxiv.1905.10091 preprint EN other-oa arXiv (Cornell University) 2019-01-01

RHJoin: A fast and space-efficient join method for log processing in MapReduce

Dixin Tang, Taoying Liu, Hong Liu, Wei Li

Equi-join is heavily used in MapReduce-based log processing. With the rapid growth of dataset sizes, join methods on MapReduce have been extensively studied recently. We find that existing join methods usually cannot get high query performance and affordable storage consumption at the same time when faced with a huge amount of log data. They either only optimize one aspect but significantly sacrifice the other, or have limited applications. In this paper, after analyzing the characteristics of log workloads and the underlying MapReduce, we...

10.1109/padsw.2014.7097918 article EN 2014-12-01
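
For readers unfamiliar with the baseline the abstract above refers to, the following sketch shows a plain reduce-side (repartition) equi-join simulated in memory. It is not RHJoin, whose design is not described in the truncated abstract. The record layouts, the `"L"`/`"R"` source tags, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(records, source_tag, key_field):
    """Emit (join key, (source tag, record)) pairs, as a mapper would."""
    for rec in records:
        yield rec[key_field], (source_tag, rec)

def shuffle(pairs):
    """Group values by key, standing in for the MapReduce shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """For each key, pair every left record with every right record."""
    for tagged in groups.values():
        left = [rec for tag, rec in tagged if tag == "L"]
        right = [rec for tag, rec in tagged if tag == "R"]
        for l_rec in left:
            for r_rec in right:
                yield {**l_rec, **r_rec}

# Hypothetical inputs: an access log and a small user table.
logs = [
    {"user_id": 1, "url": "/a"},
    {"user_id": 2, "url": "/b"},
    {"user_id": 1, "url": "/c"},
]
users = [
    {"user_id": 1, "name": "ann"},
    {"user_id": 3, "name": "bob"},
]

pairs = list(map_phase(logs, "L", "user_id")) + list(map_phase(users, "R", "user_id"))
joined = list(reduce_phase(shuffle(pairs)))
print(joined)
```

In this baseline every record from both inputs crosses the shuffle, which is one reason join cost grows quickly with log volume.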