NFDI4DS | UHH-SEMS - Publication Details

ConferencingSpeech Challenge: Towards Far-Field Multi-Channel Speech Enhancement for Video Conferencing

OPENALEX - Publications

Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, and 9 more

The ConferencingSpeech 2021 challenge is proposed to stimulate research on far-field multi-channel speech enhancement for video conferencing. The challenge consists of two separate tasks: 1) Task 1 is multi-channel speech enhancement with a single microphone array, focusing on practical application with a real-time requirement, and 2) Task 2 is multi-channel speech enhancement with multiple distributed microphone arrays, which is a non-real-time track and does not have any constraints, so that participants could explore any algorithms to obtain high speech quality. Targeting the real video conferencing room application,...

10.1109/asru51503.2021.9688126 article EN 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021-12-13

Speaker verification using attentive multi-scale convolutional recurrent network

OPENALEX - Publications

Yanxiong Li, Zhongjie Jiang, Wenchang Cao, Qisheng Huang

10.1016/j.asoc.2022.109291 article EN Applied Soft Computing 2022-07-11

Speaker Clustering by Co-Optimizing Deep Representation Learning and Cluster Estimation

OPENALEX - Publications

Yanxiong Li, Wucheng Wang, Mingle Liu, Zhongjie Jiang, Qianhua He

Speaker clustering is a task to merge speech segments uttered by the same speaker into a single cluster, which is an effective tool for alleviating the management of massive amounts of audio documents. In this paper, we present a work of co-optimizing the two main steps of speaker clustering, namely, feature learning and cluster estimation. In our method, the deep representation feature is learned by a deep convolutional autoencoder network (DCAN), while the cluster estimation is realized by a softmax layer that is combined with the DCAN. We devise an integrated loss function...

10.1109/tmm.2020.3024667 article EN IEEE Transactions on Multimedia 2020-09-21

INTERSPEECH 2021 ConferencingSpeech Challenge: Towards Far-field Multi-Channel Speech Enhancement for Video Conferencing

OPENALEX - Publications

Wei Rao, Yihui Fu, Yanxin Hu, Xin Xu, Yvkai Jv, and 9 more

The ConferencingSpeech 2021 challenge is proposed to stimulate research on far-field multi-channel speech enhancement for video conferencing. The challenge consists of two separate tasks: 1) Task 1 is multi-channel speech enhancement with a single microphone array, focusing on practical application with a real-time requirement, and 2) Task 2 is multi-channel speech enhancement with multiple distributed microphone arrays, which is a non-real-time track and does not have any constraints, so that participants could explore any algorithms to obtain high speech quality. Targeting the real video conferencing room application, the challenge database was...

10.48550/arxiv.2104.00960 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Lightweight Speaker Verification Using Transformation Module With Feature Partition and Fusion

OPENALEX - Publications

Yanxiong Li, Zhongjie Jiang, Qisheng Huang, Wenchang Cao, Jialong Li

Although many efforts have been made on decreasing the model complexity for speaker verification, it is still challenging to deploy speaker verification systems with satisfactory results on low-resource terminals. We design a transformation module that performs feature partition and fusion to implement lightweight speaker verification. The transformation module consists of multiple simple but effective operations, such as convolution, pooling, mean, concatenation, normalization, and element-wise summation. It works in a plug-and-play way,...

10.1109/taslp.2023.3338533 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2023-12-05

Speaker verification using attentive multi-scale convolutional recurrent network

OPENALEX - Publications

Yanxiong Li, Zhongjie Jiang, Wenchang Cao, Qisheng Huang

In this paper, we propose a speaker verification method by an Attentive Multi-scale Convolutional Recurrent Network (AMCRN). The proposed AMCRN can acquire both local spatial information and global sequential information from the input speech recordings. In the proposed method, the logarithm Mel spectrum is extracted from each speech recording and then fed to the proposed AMCRN for learning a speaker embedding. Afterwards, the learned speaker embedding is fed to the back-end classifier (such as a cosine similarity metric) for scoring in the testing stage. The proposed method is compared with state-of-the-art methods...

10.48550/arxiv.2306.00426 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01
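
The abstract above names a cosine similarity metric as one possible back-end for scoring speaker embeddings. A minimal sketch of that scoring step follows, assuming embeddings are plain lists of floats. The function names, the example vectors, and the 0.5 threshold are hypothetical illustrations and are not taken from the paper; the AMCRN front-end itself is not reproduced here.

```python
import math


def cosine_score(emb_a, emb_b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(enroll_emb, test_emb, threshold=0.5):
    """Accept the trial when the score reaches the threshold.

    The default threshold is a hypothetical placeholder; in practice it
    would be tuned on held-out trials.
    """
    return cosine_score(enroll_emb, test_emb) >= threshold


# Toy embeddings (made up for illustration, not real model outputs).
enroll = [0.2, 0.9, -0.1]
test_same = [0.25, 0.85, -0.05]
test_other = [-0.7, 0.1, 0.6]

print(same_speaker(enroll, test_same))
print(same_speaker(enroll, test_other))
```

Because cosine similarity ignores vector length, only the direction of the two embeddings affects the decision, which is one reason it is a common scoring choice for embedding-based verification.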

Lightweight Speaker Verification Using Transformation Module with Feature Partition and Fusion

OPENALEX - Publications

Yanxiong Li, Zhongjie Jiang, Qisheng Huang, Wenchang Cao, Jialong Li

Although many efforts have been made on decreasing the model complexity for speaker verification, it is still challenging to deploy speaker verification systems with satisfactory results on low-resource terminals. We design a transformation module that performs feature partition and fusion to implement lightweight speaker verification. The transformation module consists of multiple simple but effective operations, such as convolution, pooling, mean, concatenation, normalization, and element-wise summation. It works in a plug-and-play way,...

10.48550/arxiv.2312.03324 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01