- Speech and Audio Processing
- Advanced Adaptive Filtering Techniques
- Speech Recognition and Synthesis
- Music and Audio Processing
- Hearing Loss and Rehabilitation
- Advanced Battery Materials and Technologies
- Advancements in Battery Materials
- Indoor and Outdoor Localization Technologies
- Supercapacitor Materials and Fabrication
- Structural Health Monitoring Techniques
- Advanced Battery Technologies Research
- Analytical chemistry methods development
- Higher Education and Teaching Methods
- Acoustic Wave Phenomena Research
- Infant Health and Development
- Analytical Chemistry and Sensors
- Electrochemical Analysis and Applications
- Emotion and Mood Recognition
Chinese Academy of Sciences
2021-2024
University of Chinese Academy of Sciences
2021-2024
Beijing National Laboratory for Molecular Sciences
2023-2024
Institute of Acoustics
2021-2023
Tencent (China)
2023
Harbin Institute of Technology
2023
Dalian University of Technology
2023
Xi'an Aeronautical University
2006
For challenging acoustic scenarios such as low signal-to-noise ratios, current speech enhancement systems usually suffer from a performance bottleneck in extracting the target speech from the mixtures within one step. To address this issue, we propose a novel complex spectral mapping approach with a two-stage pipeline for monaural speech enhancement in the time-frequency domain. The proposed algorithm aims to decouple the primal problem into multiple sub-problems, which follows the classic proverb, "two heads are better than one". More specifically,...
Layered transition metal oxide cathodes have been one of the dominant cathodes for lithium-ion batteries with efficient Li+ intercalation chemistry. However, limited by weak layered interaction and an unstable surface, mechanical and chemical failure plagues their electrochemical performance, especially for Ni-rich cathodes. Here, adopting a simultaneous elemental-structural atomic arrangement control based on the intrinsic Ni-Co-Mn system, the surface role is intensively investigated. Within invariant oxygen...
Background noise and room reverberation are regarded as two major factors that degrade subjective speech quality. In this paper, we propose an integrated framework to address simultaneous denoising and dereverberation under complicated scenario environments. It adopts a chain optimization strategy and designs four sub-stages accordingly. In the first stages, we decouple the multi-task learning w.r.t. the complex spectrum into magnitude and phase, and only implement removal in the magnitude domain. Based on the estimated priors above, we further...
Ni-rich cathodes are some of the most promising candidates for advanced lithium-ion batteries, but their available capacities have been stagnant due to intrinsic Li
It remains a tough challenge to recover speech signals contaminated by various noises in real acoustic environments. To this end, we propose a novel system for denoising in complicated applications, which mainly comprises two pipelines, namely a two-stage network and a post-processing module. The first pipeline is proposed to decouple the optimization problem w.r.t. magnitude and phase, i.e., only the magnitude is estimated in the first stage, and both of them are further refined in the second stage. The second pipeline aims to suppress the remaining unnatural...
Standing upon the intersection of traditional beamformers and deep neural networks, we propose a causal neural beamformer paradigm called Embedding and Beamforming, and two core modules are devised accordingly, namely EM and BM. For EM, instead of estimating the spatial covariance matrix explicitly, a 3-D embedding tensor is learned with the network, where spatial-spectral discriminative information can be implicitly represented. For BM, a network is directly leveraged to derive the beamforming weights so as to implement filter-and-sum...
While deep neural networks have facilitated significant advancements in the field of speech enhancement, most existing methods are developed following either empirical or relatively blind criteria, lacking adequate guidelines for pipeline design. Inspired by Taylor's theorem, we propose a general unfolding framework for both single- and multi-channel speech enhancement tasks. Concretely, we formulate the complex spectrum recovery into the spectral magnitude mapping in the neighborhood space of the noisy mixture, in which an...
This paper describes the legends-tencent team's real-time General Speech Restoration (Gesper) system submitted to the ICASSP 2023 Speech Signal Improvement (SSI) Challenge. The newly proposed system is a two-stage architecture, in which speech restoration is performed first, followed by speech enhancement. We propose a complex spectral mapping-based generative adversarial network (CSM-GAN) as the speech restoration module for the first time. For noise suppression and dereverberation, the enhancement module is presented with fullband-wideband parallel processing. On...
It is highly desirable that speech enhancement algorithms achieve good performance while keeping low latency for many applications, such as digital hearing aids, mobile phones, acoustically transparent devices, and public address systems. To improve the performance of traditional low-latency algorithms, a deep filter-bank equalizer (FBE) framework was proposed that integrated a deep learning-based subband noise reduction network with a shortened filter mapping network. In the first network, a deep learning model is trained...
Most deep-learning-based multi-channel speech enhancement methods focus on designing a set of beamforming coefficients to directly filter the low signal-to-noise ratio signals received by microphones, which hinders the performance of these approaches. To handle these problems, this paper designs a causal neural beam filter that fully exploits the spectro-temporal-spatial information in the beamspace domain. Specifically, multiple beams are designed to steer towards all directions, using a parameterized super-directive beamformer...
Due to the high computational complexity required to model more frequency bands, it is still intractable to conduct full-band speech enhancement based on deep neural networks. Recent studies typically utilize compressed perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band...
Most deep learning-based multi-channel speech enhancement methods focus on designing a set of beamforming coefficients to directly filter the low signal-to-noise ratio signals received by microphones, which hinders the performance of these approaches. To handle these problems, this paper designs a causal neural beam filter that fully exploits the spatial-spectral information in the beam domain. Specifically, multiple beams are designed to steer towards all directions using a parameterized super-directive beamformer in the first stage....
Due to the high computational complexity required to model more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically utilize compressed perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and...
The spatial covariance matrix has been considered to be significant for beamformers. Standing upon the intersection of traditional beamformers and deep neural networks, we propose a causal neural beamformer paradigm called Embedding and Beamforming, and two core modules are designed accordingly, namely EM and BM. For EM, instead of estimating the spatial covariance matrix explicitly, a 3-D embedding tensor is learned with the network, where both spectral and spatial discriminative information can be represented. For BM, a network is directly leveraged to derive the beamforming...