NFDI4DS | UHH-SEMS - Publication Details

Meng Yu

ORCID: 0000-0002-0031-9156

About

Contact & Profiles

A5106407019

Research Areas

Speech and Audio Processing
Music and Audio Processing
Speech Recognition and Synthesis
Advanced Adaptive Filtering Techniques
Hearing Loss and Rehabilitation
Acoustic Wave Phenomena Research
X-ray Diffraction in Crystallography
Crystallization and Solubility Studies
Advanced Data Compression Techniques
Indoor and Outdoor Localization Technologies
Optical Wireless Communication Technologies
Distributed systems and fault tolerance
Data Management and Algorithms
Blind Source Separation Techniques
Time Series Analysis and Forecasting
Natural Language Processing Techniques
Ergonomics and Musculoskeletal Disorders
Software System Performance and Reliability
Mechanical Engineering and Vibrations Research
Flame retardant materials and properties
Advanced Data Storage Technologies
Power Systems and Renewable Energy
IoT and Edge/Fog Computing
Ultrasonics and Acoustic Wave Propagation
Music Technology and Sound Studies

Bellevue Hospital Center
2019-2025

Jiaxing University
2024-2025

Tencent (China)
2018-2024

Hokkaido University
2022-2024

Zhejiang Chinese Medical University
2024

Southern Medical University
2022-2024

State Grid Corporation of China (China)
2018-2024

Aerospace Information Research Institute
2024

Chinese Academy of Sciences
2024

Sun Yat-sen University
2024

Time Domain Audio Visual Speech Separation

OPENALEX - Publications

Jian Wu Yong Xu Shi-Xiong Zhang Lianwu Chen Meng Yu and 2 more

Audio-visual multi-modal modeling has been demonstrated to be effective in many speech-related tasks, such as recognition and enhancement. This paper introduces a new time-domain audio-visual architecture for target speaker extraction from monaural mixtures. The architecture generalizes the previous TasNet (time-domain separation network) to enable multi-modal learning, and it extends classical frequency-domain audio-visual separation to the time domain. The main components of the proposed architecture include an audio encoder, a video encoder that extracts lip...

10.1109/asru46091.2019.9003983 article EN 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019-12-01

ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation

Zhuohuang Zhang Yong Xu Meng Yu Shi-Xiong Zhang Lianwu Chen and 1 more

Speech separation algorithms are often used to separate the target speech from other interfering sources. However, purely neural network based systems cause nonlinear distortion that is harmful for automatic speech recognition (ASR) systems. The conventional mask-based minimum variance distortionless response (MVDR) beamformer can be used to minimize the distortion, but it comes with a high level of residual noise. Furthermore, the matrix operations (e.g., matrix inversion) involved in the MVDR solution are sometimes numerically...

10.1109/icassp39728.2021.9413594 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

End-to-End Multi-Channel Speech Separation

Rongzhi Gu Jian Wu Shi-Xiong Zhang Lianwu Chen Yong Xu and 4 more

The end-to-end approach for single-channel speech separation has been studied recently and has shown promising results. This paper extends the previous approach and proposes a new end-to-end model for multi-channel speech separation. The primary contributions of this work include: 1) an integrated waveform-in waveform-out separation system in a single neural network architecture; 2) a reformulation of the traditional short-time Fourier transform (STFT) and inter-channel phase difference (IPD) as a function of time-domain convolution with a special kernel; 3)...

10.48550/arxiv.1905.06286 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures

Jun Wang Jie Chen Dan Su Lianwu Chen Meng Yu and 2 more

Speaker-aware source separation methods are promising workarounds for major difficulties such as arbitrary source permutation and an unknown number of sources. However, it remains challenging to achieve satisfying performance provided a very short available target speaker utterance (anchor). Here we present a novel "deep extractor network" which creates an extractor point for the target speaker in a canonical high dimensional embedding space, and pulls together the time-frequency bins corresponding to the target speaker. The proposed model is different from...

10.21437/interspeech.2018-1205 preprint EN Interspeech 2018 2018-08-28

Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information

Rongzhi Gu Lianwu Chen Shi-Xiong Zhang Jimeng Zheng Yong Xu and 4 more

10.21437/interspeech.2019-2266 article EN Interspeech 2019 2019-09-13

Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition

Aswin Shanmugam Subramanian Chao Weng Shinji Watanabe Meng Yu Dong Yu

10.1016/j.csl.2022.101360 article EN Computer Speech & Language 2022-02-10

Quercetin attenuates cisplatin-induced mitochondrial apoptosis via PI3K/Akt mediated inhibition of oxidative stress in pericytes and improves the blood labyrinth barrier permeability

Tian-lan Huang Wenjun Jiang Zan Zhou Tian-feng Shi Miao Yu and 4 more

10.1016/j.cbi.2024.110939 article EN Chemico-Biological Interactions 2024-03-13

Preparation of ethyl cellulose microencapsulated ammonium polyphosphate and its application in flame retardant cellulose paper

Kexin Liu Yao Li Ling Xu Feng Zhu Yu Zhang and 2 more

10.1016/j.indcrop.2024.118132 article EN Industrial Crops and Products 2024-02-09

Exploring emotional support and engagement in adolescent EFL learning: The mediating role of emotion regulation strategies

Yuchi Zhang Yibin Hu Meng Yu

The crucial role of emotion regulation in learning has been well established, but its potential impact on the English as a foreign language (EFL) learning process remains uncertain. Examining the relationship between emotion regulation strategies and EFL engagement, along with its antecedent variables, has significant theoretical and practical value. This study aims to explore the mediating effects of emotion regulation strategies (cognitive reappraisal and suppression) on the associations between perceived teacher social support, peer support, and engagement among Chinese adolescents. The data were gathered...

10.1177/13621688241266184 article EN Language Teaching Research 2024-07-27

Audio-Visual Speech Separation and Dereverberation With a Two-Stage Multimodal Network

Ke Tan Yong Xu Shi-Xiong Zhang Meng Yu Dong Yu

Background noise, interfering speech and room reverberation frequently distort target speech in real listening environments. In this study, we address joint speech separation and dereverberation, which aims to separate target speech from background noise, interfering speech and room reverberation. In order to tackle this fundamentally difficult problem, we propose a novel multimodal network that exploits both audio and visual signals. The proposed network architecture adopts a two-stage strategy, where a separation module is employed to attenuate background noise and interfering speech in the first stage and a dereverberation module is employed to suppress room reverberation in the second...

10.1109/jstsp.2020.2987209 article EN IEEE Journal of Selected Topics in Signal Processing 2020-03-01

Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning

Rongzhi Gu Shi-Xiong Zhang Lianwu Chen Yong Xu Meng Yu and 3 more

Hand-crafted spatial features (e.g., inter-channel phase difference, IPD) play a fundamental role in recent deep learning based multi-channel speech separation (MCSS) methods. However, these manually designed spatial features are hard to incorporate into the end-to-end optimized MCSS framework. In this work, we propose an integrated architecture for learning spatial features directly from the multi-channel speech waveforms within an end-to-end speech separation framework. In this architecture, time-domain filters spanning signal channels are trained to perform adaptive spatial filtering. These filters are implemented by a 2d convolution...

10.1109/icassp40776.2020.9053092 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

A Novel Coacervate Embolic Agent for Tumor Chemoembolization

Menghui Liu Yang Sun Yitong Zhou Yanlv Chen Meng Yu and 7 more

Transcatheter arterial chemoembolization (TACE) has proven effective in blocking tumor-supplied arteries and delivering localized chemotherapeutic treatment to combat tumors. However, traditional embolic TACE agents exhibit certain limitations, including insufficient drug-loading and sustained-release capabilities, non-biodegradability, susceptibility to aggregation, and unstable mechanical properties. This study introduces a novel approach to address these shortcomings by utilizing a complex coacervate as...

10.1002/adhm.202304488 article EN Advanced Healthcare Materials 2024-04-08

The Upregulation of IL-1β Induced by Cisplatin Triggers PI3K/AKT/MMP9 Pathway in Pericytes Mediating the Leakage of the Blood Labyrinth Barrier

Miao Yu Wenjun Jiang Meng Yu Zan Zhou Min Wang and 1 more

Background: Blood-labyrinth barrier (BLB) damage has been recognized as a key mechanism underlying cisplatin (CDDP)-induced hearing loss. Inflammation within the cochlea, triggered by CDDP, is a pathological response. However, the relationship between CDDP-induced inflammation and BLB dysfunction remains elusive. Materials and Methods: In vivo and in vitro models were used to explore the inflammatory mechanisms of CDDP ototoxicity. C57BL/6J mice were treated with CDDP, and IL-1β levels, BLB permeability, and hearing thresholds were assessed using...

10.2147/jir.s492292 article EN cc-by-nc Journal of Inflammation Research 2025-01-01

Neural Ambisonic Encoding For Multi-Speaker Scenarios Using A Circular Microphone Array

Yue Qiao Vinay Kothapally Meng Yu Dong Yu

10.1109/icassp49660.2025.10890048 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust ASR

Yong Xu Chao Weng Like Hui Jianming Liu Meng Yu and 2 more

In this paper, we present a joint training framework between the multi-channel beamformer and the acoustic model for noise robust automatic speech recognition (ASR). The complex ratio mask (CRM), demonstrated to be more effective than the ideal ratio mask (IRM), is proposed to estimate the covariance matrix for the beamformer. Minimum Variance Distortionless Response (MVDR) and Generalized Eigenvalue (GEV) beamformers are both investigated under the CRM-based architecture. We also propose a pooling strategy among multiple channels. A long...

10.1109/icassp.2019.8682576 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation

Yong Xu Zhuohuang Zhang Meng Yu Shi-Xiong Zhang Dong Yu

Although the conventional mask-based minimum variance distortionless response (MVDR) beamformer could reduce non-linear distortion, the residual noise level of the MVDR separated speech is still high. In this paper, we propose a spatio-temporal recurrent neural network based beamformer (RNN-BF) for target speech separation. This new beamforming framework directly learns the beamforming weights from the estimated speech and noise spatial covariance matrices. Leveraging the temporal modeling capability of RNNs, the RNN-BF could automatically accumulate the statistics...

10.21437/interspeech.2021-430 article EN Interspeech 2021 2021-08-27

FAST-RIR: Fast Neural Diffuse Room Impulse Response Generator

Anton Ratnarajah Shi-Xiong Zhang Meng Yu Zhenyu Tang Dinesh Manocha and 1 more

We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment. Our FAST-RIR takes rectangular room dimensions, listener and speaker positions, and reverberation time ($T_{60}$) as inputs and generates specular and diffuse reflections. FAST-RIR is capable of generating RIRs for a given input $T_{60}$ with an average error of 0.02s. We evaluate our generated RIRs in automatic speech...

10.1109/icassp43922.2022.9747846 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Soumi Maiti Yushi Ueda Shinji Watanabe Chunlei Zhang Meng Yu and 2 more

In this paper, we present a novel framework that jointly performs three tasks: speaker diarization, speech separation, and speaker counting. Our proposed framework integrates speaker diarization based on end-to-end neural diarization (EEND) models, speaker counting with encoder-decoder based attractors (EDA), and speech separation using Conv-TasNet. In addition, we propose a multiple $1 \times 1$ convolutional layer architecture for estimating the separation masks...

10.1109/slt54892.2023.10022924 article EN 2022 IEEE Spoken Language Technology Workshop (SLT) 2023-01-09

Neural Spatio-Temporal Beamformer for Target Speech Separation

Yong Xu Meng Yu Shi-Xiong Zhang Lianwu Chen Chao Weng and 2 more

Purely neural network (NN) based speech separation and enhancement methods, although they can achieve good objective scores, inevitably cause nonlinear speech distortions that are harmful for automatic speech recognition (ASR). On the other hand, the minimum variance distortionless response (MVDR) beamformer with NN-predicted masks, although it can significantly reduce speech distortions, has limited noise reduction capability. In this paper, we propose a multi-tap MVDR beamformer with complex-valued masks for speech separation and enhancement. Compared to the state-of-the-art NN-mask...

10.21437/interspeech.2020-1458 article EN Interspeech 2020 2020-10-25

Speaker-Aware Target Speaker Enhancement by Jointly Learning with Speaker Embedding Extraction

Xuan Ji Meng Yu Chunlei Zhang Dan Su Tao Yu and 2 more

Deep learning based speech separation approaches have received great interest, among which the recent speaker-aware speech enhancement methods are promising for solving difficulties such as arbitrary source permutation and an unknown number of sources. In this paper, we propose a novel training framework which jointly learns the speaker-conditioned target speaker extraction model and its associated speaker embedding model. The resulting unified model directly learns the appropriate speaker embedding for improved target speaker enhancement. We demonstrate, on our large...

10.1109/icassp40776.2020.9054311 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation

Zhuohuang Zhang Yong Xu Meng Yu Shi-Xiong Zhang Lianwu Chen and 2 more

Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems. Minimum variance distortionless response (MVDR) filters are often adopted to remove nonlinear distortions; however, conventional mask-based MVDR systems still result in relatively high levels of residual noise. Moreover, the matrix inverse involved in the MVDR solution is sometimes numerically...

10.1109/taslp.2021.3129335 article EN IEEE/ACM Transactions on Audio, Speech, and Language Processing 2021-01-01

Multi-band PIT and Model Integration for Improved Multi-channel Speech Separation

Lianwu Chen Meng Yu Dan Su Dong Yu

The recent exploration of deep learning for supervised speech separation has significantly accelerated the progress on the multi-talker speech separation problem. Multi-channel extension has attracted much research attention due to the benefit of spatial information in far-field acoustic environments. In this paper, we review the most recent models of multi-channel permutation invariant training (PIT), investigate spatial features formed by microphone pairs and their underlying impact and issue, and present a multi-band architecture for effective feature...

10.1109/icassp.2019.8682470 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-16

Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition

Bin Liu Shuai Nie Shan Liang Wenju Liu Meng Yu and 3 more

10.21437/interspeech.2019-1242 article EN Interspeech 2019 2019-09-13

Advancing Acoustic Howling Suppression Through Recursive Training of Neural Networks

Hao Zhang Yixuan Zhang Meng Yu Dong Yu

In this paper, we introduce a novel training framework designed to comprehensively address the acoustic howling issue by examining its fundamental formation process. This framework integrates a neural network (NN) module into the closed-loop system during training, with signals generated recursively on the fly to closely mimic the streaming process of acoustic howling suppression (AHS). The proposed recursive training strategy bridges the gap between training and real-world inference scenarios, marking a departure from previous NN-based methods that typically approach...

10.1109/icassp48485.2024.10447839 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
