Xiaoyi Qin

ORCID: 0000-0003-2521-8084
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Natural Language Processing Techniques
  • Sepsis Diagnosis and Treatment
  • Phonetics and Phonology Research
  • Advancements in PLL and VCO Technologies
  • Topic Modeling
  • Glaucoma and retinal disorders
  • Blood transfusion and management
  • Lung Cancer Diagnosis and Treatment
  • Heart Failure Treatment and Management
  • Sinusitis and nasal conditions
  • Hematopoietic Stem Cell Transplantation
  • Lung Cancer Treatments and Mutations
  • Pancreatitis Pathology and Treatment
  • Ear and Head Tumors
  • Advanced Computational Techniques and Applications
  • Renin-Angiotensin System Studies
  • Ophthalmology and Visual Impairment Studies
  • Blind Source Separation Techniques
  • Voice and Speech Disorders
  • Liver Disease Diagnosis and Treatment
  • Ocular Oncology and Treatments
  • Network Time Synchronization Technologies

Southern Medical University
2025

Wuhan University
2022-2024

Duke Kunshan University
2019-2024

Wenzhou Medical University
2010-2023

First Affiliated Hospital of Wenzhou Medical University
2018-2023

Ganzhou People's Hospital
2023

Hubei Institute of Fine Arts
2022

Affiliated Eye Hospital of Wenzhou Medical College
2010-2020

Sun Yat-sen University
2017-2019

Nanjing Medical University
2011-2019

This paper presents a far-field text-dependent speaker verification database named HI-MIA. We aim to meet the data requirement for far-field microphone array based speaker verification, since most publicly available databases are single-channel, close-talking, and text-independent. The database contains recordings of 340 people in rooms designed for the far-field scenario. Recordings are captured by multiple microphone arrays located in different directions and distances, and by a high-fidelity microphone. Besides, we propose a set of end-to-end neural network based baseline systems that adopt...

10.1109/icassp40776.2020.9054423 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Recently, attention mechanisms such as the squeeze-and-excitation module (SE) and the convolutional block attention module (CBAM) have achieved great success in deep learning-based speaker verification systems. This paper introduces an alternative effective yet simple one, i.e., the simple attention module (SimAM), for speaker verification. SimAM is a plug-and-play module without extra modal parameters. In addition, we propose a noisy label detection method to iteratively filter out data samples with noisy labels from the training data, considering that large-scale...

10.1109/icassp43922.2022.9746294 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Target-speaker voice activity detection is currently a promising approach for speaker diarization in complex acoustic environments. This paper presents a novel Sequence-to-Sequence Target-Speaker Voice Activity Detection (Seq2Seq-TSVAD) method that can efficiently address the joint modeling of large-scale speakers and predict high-resolution voice activities. Experimental results show that larger speaker capacity and higher output resolution can significantly reduce the diarization error rate (DER), which achieves the new state-of-the-art...

10.1109/icassp49357.2023.10094752 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Background: Corona Virus Disease 2019 (COVID-19) has become a global pandemic. This study established prognostic scoring models based on comorbidities and other clinical information for severe and critical patients with COVID-19. Material and Methods: We retrospectively collected data from 51 patients diagnosed as severe or critical COVID-19 who were admitted between January 29, 2020, and February 18, 2020. The Charlson (CCI), Elixhauser (ECI), and age- and smoking-adjusted Charlson (ASCCI) and Elixhauser (ASECI) comorbidity indices were used to evaluate the...

10.7150/ijms.50007 article EN cc-by-nc International Journal of Medical Sciences 2020-01-01

The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three different research problems under well-defined conditions: far-field text-dependent speaker verification from a single microphone array, far-field text-independent speaker verification from a single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three tasks pose a cross-channel challenge to the participants. To simulate the real-life scenario, the enrollment utterances are recorded from a close-talk cellphone, while the test utterances are recorded from the far-field microphone arrays. In this paper, we describe the database, the challenge, and the baseline...

10.21437/interspeech.2020-1249 article EN Interspeech 2020 2020-10-25

Achiral TTHPs except 5k aggregate via N1-inversion-based racemates, and inter-enantiomer cofacial π-stacking abnormally enhances Φ SF values significantly.

10.1039/d4qo02159b article EN Organic Chemistry Frontiers 2025-01-01

Purpose: To critically evaluate whether the adenosine A2A receptor (A2AR) plays a role in postnatal refractive development in mice. Methods: Custom-built biometric systems specifically designed for mice were used to assess the development of relative myopia by examining refraction and biometrics in A2AR knockout (KO) mice and wild-type (WT) littermates between postnatal days (P)28 and P56. Ocular dimensions were measured by customized optical coherence tomography (OCT), refractive state by eccentric infrared photorefraction (EIR), and corneal radius of curvature...

10.1167/iovs.09-3998 article EN Investigative Ophthalmology & Visual Science 2010-05-19

Gastrointestinal acute graft-versus-host disease (GI aGVHD) is a lethal complication following allogeneic hematopoietic stem cell transplantation (HSCT). However, it is still very difficult to make a diagnosis of GI aGVHD in practice. To date, no consensus plasma biomarker can be used to help diagnosis. Here, we attempted to identify GI aGVHD-associated proteins in a murine model, which [...] aGVHD. We used 8-plex isobaric tags for relative and absolute quantitation (8-plex iTRAQ) to screen samples taken from murine models before...

10.1186/s12950-017-0178-z article EN cc-by Journal of Inflammation 2018-01-05

DukeECE. As highly overlapped speech exists in the dataset, we employ an x-vector-based target-speaker voice activity detection (TS-VAD) to find the overlap between speakers. Firstly, we separately train a single-channel model for each of the 8 channels and fuse the results. In addition, we also employ cross-channel self-attention to further improve the performance, where the non-linear spatial correlations between different channels are learned and fused. Experimental results on the evaluation set show that TS-VAD reduces the DER by over 75% from 12.68%...

10.1109/icassp43922.2022.9747019 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Recent studies have highlighted adverse outcomes of fluid overload in critically ill patients. Therefore, its early recognition is essential for the management of these patients. Our aim was to propose a deep learning (DL) model using data from noninvasive chest X‑ray (CXR) imaging associated with fluid status. We collected data from the Medical Information Mart for Intensive Care IV (MIMIC‑IV, v. 1.0) and MIMIC Chest X‑Ray (v. 2.0.0) databases for modeling, and from our hospital database for testing. The extravascular lung water index...

10.20452/pamw.16396 article EN Polskie Archiwum Medycyny Wewnętrznej 2023-01-04

In this paper, we present the system submission for the VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20) by the DKU-DukeECE team. For track 1, we explore various kinds of state-of-the-art front-end extractors with different pooling layers and objective loss functions. For track 3, we employ an iterative framework for self-supervised speaker representation learning based on a deep neural network (DNN). For track 4, we investigate the whole pipeline of speaker diarization, including voice activity detection (VAD), uniform segmentation,...

10.48550/arxiv.2010.12731 preprint EN other-oa arXiv (Cornell University) 2020-01-01

This paper describes a conditional neural network architecture for Mandarin Chinese polyphone disambiguation. The system is composed of a bidirectional recurrent neural network component acting as a sentence encoder to accumulate the context correlations, followed by a prediction network that maps the polyphonic character embeddings along with the conditions to the corresponding pronunciations. We obtain the word-level condition from a pre-trained word-to-vector lookup table. One goal of polyphone disambiguation is to address the homograph problem existing in...

10.21437/interspeech.2019-1235 article EN Interspeech 2019 2019-09-13

Overexpression of Wilms' tumor-1 (WT1) transcription factor facilitates proliferation in acute myeloid leukemia (AML). However, whether WT1 is enriched in the leukemia-initiating cells (LICs) and leukemia stem cells (LSCs) and facilitates the self-renewal of LSCs remains poorly understood. An MLL-AF9-induced murine leukemia model was used to evaluate the effect of knockdown of wt1 on the self-renewal ability of LSCs. RNA sequencing was performed on WT1-overexpressing cells to select WT1 targets. Apoptosis and colony formation assays were used to assess the anti-leukemic potential of a deubiquitinase inhibitor...

10.1186/s12967-020-02384-y article EN cc-by Journal of Translational Medicine 2020-06-24

With the development of deep learning, automatic speaker verification has made considerable progress over the past few years. However, designing a lightweight and robust system with limited computational resources is still a challenging problem. Traditionally, a speaker verification system is symmetrical, indicating that the same embedding extraction model is applied for both enrollment and verification in inference. In this paper, we come up with an innovative asymmetric structure, which takes the large-scale ECAPA-TDNN model for enrollment and the small-scale ECAPA-TDNNLite model for verification....

10.1109/icassp43922.2022.9746247 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

In this paper, we introduce a large-scale and high-quality audio-visual speaker verification dataset, named VoxBlink. We propose an innovative and robust automatic audio-visual data mining pipeline to curate this dataset, which contains 1.45M utterances from 38K speakers. Due to the inherent nature of automated data collection, introducing noisy data is inevitable. Therefore, we also utilize a multi-modal purification step to generate a cleaner version of VoxBlink, named VoxBlink-clean, comprising 18K identities and 1.02M utterances. In contrast...

10.1109/icassp48485.2024.10446780 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

The purpose of our study is to analyze the clinical, ultrasonic, microbiologic, and histopathologic characteristics, management, and outcomes in a series of primary canaliculitis with concretions in patients who underwent canaliculotomy with curettage. Thirty-six patients were reviewed for age, sex, location and laterality, duration of symptoms, clinical and ultrasonic signs, result of microbiologic culture and histopathologic examination, treatment, and outcomes. Main outcome measures: microbiological characteristics of canalicular concretions; clinical profiles; treatment...

10.1097/md.0000000000006188 article EN cc-by-nc Medicine 2017-03-01

Nowadays, as more and more systems achieve good performance in traditional voice conversion (VC) tasks, people's attention gradually turns to VC tasks under extreme conditions. In this paper, we propose a novel method for zero-shot voice conversion. We aim to obtain intermediate representations for speaker-content disentanglement of speech to better remove speaker information and get pure content information. Accordingly, our proposed framework contains a module that removes the speaker information from the acoustic feature of the source speaker....

10.1109/icassp43922.2022.9746048 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27