- Speech and Audio Processing
- Hearing Loss and Rehabilitation
- Advanced Adaptive Filtering Techniques
- Speech Recognition and Synthesis
- RNA Research and Splicing
- Image Processing and 3D Reconstruction
- Music and Audio Processing
- Biometric Identification and Security
- Text Readability and Simplification
- Natural Language Processing Techniques
- Face and Expression Recognition
- EEG and Brain-Computer Interfaces
- 3D Shape Modeling and Analysis
- RNA modifications and cancer
- Context-Aware Activity Recognition Systems
- Topic Modeling
- Noise Effects and Management
- Iterative Learning Control Systems
- Distributed and Parallel Computing Systems
- Network Time Synchronization Technologies
- Chronic Obstructive Pulmonary Disease (COPD) Research
- Human Mobility and Location-Based Analysis
- Flood Risk Assessment and Management
- Metaheuristic Optimization Algorithms Research
- Software System Performance and Reliability
Shenyang Institute of Engineering
2014-2024
Open University of China
2023
The Ohio State University
2016-2022
University of Oklahoma Health Sciences Center
2016-2021
Starkey Hearing Technologies (United States)
2016
North China Institute of Aerospace Engineering
2013
In real-world situations, speech reaching our ears is commonly corrupted by both room reverberation and background noise. These distortions are detrimental to speech intelligibility and quality, and also pose a serious problem for many speech-related applications, including automatic speaker recognition. In order to deal with the combined effects of noise and reverberation, we propose a two-stage strategy to enhance speech, where denoising and dereverberation are conducted sequentially using deep neural networks. In addition, we design...
In daily listening environments, human speech is often degraded by room reverberation, especially under highly reverberant conditions. Such degradation poses a challenge for many speech processing systems, where the performance becomes much worse than in anechoic environments. To combat the effect of reverberation, we propose a monaural (single-channel) speech dereverberation algorithm using temporal convolutional networks with self-attention. Specifically, the proposed system includes a self-attention module to produce dynamic...
Human listeners often have difficulties understanding speech in the presence of background noise in the real world. Recently, supervised learning based speech enhancement approaches have achieved substantial success, and show significant improvements over conventional approaches. However, existing approaches try to minimize the mean squared error between the enhanced output and a pre-defined training target (e.g., the log power spectrum of clean speech), even though the purpose of such enhancement is to improve speech understanding in noise. In this paper, we propose a new deep neural...
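The conventional training criterion mentioned in this abstract (mean squared error against the log power spectrum of clean speech) can be sketched as follows. This is a minimal illustration only, not the authors' implementation; the function names and the `floor` parameter are assumptions introduced here.

```python
import math


def log_power_spectrum(magnitudes, floor=1e-10):
    """Log power spectrum of one frame, given its magnitude spectrum.

    `floor` is an assumed small constant that avoids log(0).
    """
    return [math.log(max(m * m, floor)) for m in magnitudes]


def mean_squared_error(estimate, target):
    """Average squared difference between two equal-length feature vectors."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


# Hypothetical example: one clean frame and one enhanced frame.
clean_frame = [1.0, 2.0, 0.5]
enhanced_frame = [1.0, 1.5, 0.5]
loss = mean_squared_error(
    log_power_spectrum(enhanced_frame),
    log_power_spectrum(clean_frame),
)
```

A loss of this kind measures spectral distance only, which is the limitation the abstract points to: a lower error does not necessarily mean better speech understanding in noise.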
In the real world, speech is usually distorted by both reverberation and background noise. In such conditions, speech intelligibility is degraded substantially, especially for hearing-impaired (HI) listeners. As a consequence, it is essential to enhance speech in the noisy and reverberant environment. Recently, deep neural networks have been introduced to learn a spectral mapping to enhance corrupted speech, and have shown significant improvements in objective metrics and automatic speech recognition score. However, listening tests have not yet shown any speech intelligibility benefit. In this...
In daily listening environments, speech is commonly corrupted by room reverberation and background noise. These distortions are detrimental to speech intelligibility and quality, and also severely degrade the performance of automatic speaker recognition systems. In this paper, we propose a two-stage algorithm to deal with the confounding effects of noise and reverberation separately, where denoising and dereverberation are conducted sequentially using deep neural networks. In addition, we design a new objective function that incorporates clean phase...
Recently, deep learning based speech segregation has been shown to improve human speech intelligibility in noisy environments. However, one important factor not yet considered is room reverberation, which characterizes typical daily environments. The combination of reverberation and background noise can severely degrade speech intelligibility for hearing-impaired (HI) listeners. In the current study, a deep learning based time-frequency masking algorithm was proposed to address both room reverberation and background noise. Specifically, a deep neural network was trained to estimate the ideal ratio mask,...
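For readers unfamiliar with the ideal ratio mask named in this abstract, the sketch below uses one commonly cited definition: for each time-frequency unit, the ratio of speech energy to speech-plus-noise energy, raised to an exponent (0.5 here). The abstract is truncated before it gives the exact definition used in the study, so the formula, the `beta` parameter, and the function names are assumptions for illustration only.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Ratio mask per time-frequency unit.

    speech_power, noise_power: lists of frames, each a list of per-bin
    energies (same shape). beta=0.5 is an assumed, commonly used exponent.
    """
    mask = []
    for speech_frame, noise_frame in zip(speech_power, noise_power):
        row = []
        for s, n in zip(speech_frame, noise_frame):
            total = s + n
            # An empty unit (no energy at all) gets a mask value of 0.
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask


def apply_mask(mixture_magnitude, mask):
    """Scale each time-frequency unit of the mixture by its mask value."""
    return [
        [m * g for m, g in zip(mix_frame, mask_frame)]
        for mix_frame, mask_frame in zip(mixture_magnitude, mask)
    ]


# Hypothetical energies for one frame with three frequency bins.
mask = ideal_ratio_mask([[1.0, 0.0, 9.0]], [[1.0, 4.0, 0.0]])
masked = apply_mask([[2.0, 2.0, 3.0]], mask)
```

Mask values lie between 0 (noise-dominated unit) and 1 (speech-dominated unit). In a supervised system the mask is computed from premixed training signals and a network learns to predict it from the noisy input alone.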
Human speech is usually distorted by room reverberation. These corruptions degrade speech quality and intelligibility, especially under a long reverberation time, and they also pose a serious problem for many speech-related applications such as automatic speech recognition. In this paper, we propose a supervised dereverberation algorithm that models late reverberation using a recurrent neural network (RNN) with long short-term memory (LSTM). By taking advantage of LSTM's ability to capture history, late reverberation can be effectively removed from the...
Speech separation or enhancement algorithms seldom exploit information about phoneme identities. In this study, we propose a novel phoneme-specific speech enhancement method. Rather than training a single global model to enhance all the frames, we train a separate model for each phoneme to process its corresponding frames. A robust ASR system is employed to identify the phoneme identity of each frame. This way, information from ASR systems and language models can directly influence speech enhancement by selecting a phoneme-specific model to use at the test stage. In addition, phoneme-specific models have fewer variations to model and do not exhibit...
This article describes an application of the Communities Advancing Resilience Toolkit (CART) Assessment Survey, which has been recognized as an important community tool to assist communities in their resilience-building efforts. Developed for assessing community resilience to disasters and other adversities, the CART survey can be used to obtain baseline information about a community, to identify relative community strengths and challenges, and to re-examine a community after a disaster or post intervention. Using 5 poverty neighborhoods, this article illustrates...
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on single-task speech restoration (SSR), such as speech denoising or speech declipping. However, SSR systems only focus on one task and do not address the general speech restoration problem. In addition, previous SSR systems show limited performance in some speech restoration tasks such as speech super-resolution. To overcome those limitations, we propose a general speech restoration (GSR) task that attempts to remove multiple distortions simultaneously. Furthermore, we propose VoiceFixer, a generative framework to address the GSR task. VoiceFixer consists of an analysis stage and a synthesis...
Shape assembly aims to reassemble parts (or fragments) into a complete object, which is a common task in our daily life. Different from semantic part assembly (e.g., assembling a chair's semantic parts like legs into a whole chair), geometric part assembly (e.g., assembling bowl fragments into a complete bowl) is an emerging task in computer vision and robotics. Instead of semantic information, this task focuses on geometric information of parts. As both the geometric and pose space of fractured parts are exceptionally large, shape pose disentanglement of part representations is beneficial to geometric shape assembly. In this paper, we propose to leverage SE(3) equivariance...
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus on a single type of distortion, such as speech denoising or dereverberation. However, speech signals can be degraded by several different distortions simultaneously in the real world. It is thus important to extend speech restoration models to deal with multiple distortions. In this paper, we introduce VoiceFixer, a unified framework for high-fidelity speech restoration. VoiceFixer restores speech from multiple distortions (e.g., noise, reverberation, and clipping) and can expand degraded speech (e.g., noisy speech) with a low...
Conventional multimodal biometrics systems usually do not account for missing data (missing modalities or incomplete score lists) that is commonly encountered in real applications. The presence of missing biometric data can be inconvenient to the client, as the system will reject the submitted data and request another trial. In such cases, robust verification is needed. In this paper, we present the criteria, fusion method and performance metrics of a system that verifies the client's identity at any condition of missing data. A novel adaptive SVM...
A high-precision network time synchronization algorithm based on the theory of machine self-learning is proposed in this paper. To address the problem of low time synchronization accuracy, the method takes port delay into account. Self-learning is introduced to study the port delay under different network environments. Experimental verification showed that synchronization accuracy can be greatly improved.
This study is motivated by the fact that there are currently no widely used applications available to quantitatively measure a power wheelchair user's mobility, which is an important indicator of quality of life. To address this issue, we propose an approach that allows power wheelchair users to use their own mobile devices, e.g., a smartphone or smartwatch, to non-intrusively collect mobility data in their daily life. However, the convenience of data collection brings substantial challenges in data analysis because the patterns associated with wheelchair maneuvers are not as...
Universal sound separation (USS) is a task to separate arbitrary sounds from an audio mixture. Existing USS systems are capable of separating arbitrary sources, given a few examples of the target sources as queries. However, separating arbitrary sounds with a single system is challenging, and the robustness is not always guaranteed. In this work, we propose audio prompt tuning (APT), a simple yet effective approach to enhance existing USS systems. Specifically, APT improves the separation performance of specific sources through training a small number of prompt parameters with limited audio samples, while...