- Speech Recognition and Synthesis
- Speech and Audio Processing
- Music and Audio Processing
- Natural Language Processing Techniques
- Sepsis Diagnosis and Treatment
- Phonetics and Phonology Research
- Advancements in PLL and VCO Technologies
- Topic Modeling
- Glaucoma and retinal disorders
- Blood transfusion and management
- Lung Cancer Diagnosis and Treatment
- Heart Failure Treatment and Management
- Sinusitis and nasal conditions
- Hematopoietic Stem Cell Transplantation
- Lung Cancer Treatments and Mutations
- Pancreatitis Pathology and Treatment
- Ear and Head Tumors
- Advanced Computational Techniques and Applications
- Renin-Angiotensin System Studies
- Ophthalmology and Visual Impairment Studies
- Blind Source Separation Techniques
- Voice and Speech Disorders
- Liver Disease Diagnosis and Treatment
- Ocular Oncology and Treatments
- Network Time Synchronization Technologies
Southern Medical University (2025)
Wuhan University (2022-2024)
Duke Kunshan University (2019-2024)
Wenzhou Medical University (2010-2023)
First Affiliated Hospital of Wenzhou Medical University (2018-2023)
Ganzhou People's Hospital (2023)
Hubei Institute of Fine Arts (2022)
Affiliated Eye Hospital of Wenzhou Medical College (2010-2020)
Sun Yat-sen University (2017-2019)
Nanjing Medical University (2011-2019)
This paper presents a far-field text-dependent speaker verification database named HI-MIA. We aim to meet the data requirement for far-field microphone array based speaker verification, since most of the publicly available databases are single-channel, close-talking, and text-independent. The database contains recordings of 340 people in rooms designed for the far-field scenario. Recordings are captured by multiple microphone arrays located in different directions and distances, and by a high-fidelity microphone. Besides, we propose a set of end-to-end neural network based baseline systems that adopt...
Recently, attention mechanisms such as the squeeze-and-excitation module (SE) and the convolutional block attention module (CBAM) have achieved great success in deep learning-based speaker verification systems. This paper introduces an alternative that is effective yet simple, i.e., the simple attention module (SimAM), for speaker verification. SimAM is a plug-and-play module without extra model parameters. In addition, we propose a noisy label detection method to iteratively filter out data samples with noisy labels from the training data, considering that large-scale...
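The parameter-free nature of SimAM can be illustrated with a minimal sketch. It follows the commonly cited SimAM formulation for a single channel of a 2-D feature map and is not necessarily the exact variant used in the paper above; the function name `simam`, the list-of-lists input, and the regularizer default `e_lambda=1e-4` are assumptions made for illustration.

```python
import math


def simam(feature_map, e_lambda=1e-4):
    """Apply SimAM-style attention to one channel (a 2-D list of floats).

    Hypothetical helper: no learnable weights are involved, only the
    statistics of the channel itself and the regularizer e_lambda.
    """
    values = [v for row in feature_map for v in row]
    n = len(values) - 1  # number of "other" positions in the channel
    if n < 1:
        raise ValueError("feature_map needs at least two elements")

    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in feature_map]
    variance = sum(d for row in sq_dev for d in row) / n

    out = []
    for row, dev_row in zip(feature_map, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy: positions far from the channel mean score higher.
            inv_energy = d / (4.0 * (variance + e_lambda)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid gate
            out_row.append(v * weight)
        out.append(out_row)
    return out
```

Because the gate is computed only from the channel's own mean and variance, such a module could be dropped between existing layers without adding trainable parameters, which is what "plug-and-play" refers to here.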
Target-speaker voice activity detection is currently a promising approach for speaker diarization in complex acoustic environments. This paper presents a novel Sequence-to-Sequence Target-Speaker Voice Activity Detection (Seq2Seq-TSVAD) method that can efficiently address the joint modeling of large-scale speakers and predict high-resolution voice activities. Experimental results show that larger speaker capacity and higher output resolution can significantly reduce the diarization error rate (DER), which achieves the new state-of-the-art...
Background: Coronavirus disease 2019 (COVID-19) has become a global pandemic. This study established prognostic scoring models based on comorbidities and other clinical information for severe and critical patients with COVID-19. Material and Methods: We retrospectively collected data from 51 patients diagnosed as severe or critical COVID-19 who were admitted between January 29, 2020, and February 18, 2020. The Charlson (CCI), Elixhauser (ECI), and age- and smoking-adjusted Charlson (ASCCI) and Elixhauser (ASECI) comorbidity indices were used to evaluate the...
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three different research problems under well-defined conditions: far-field text-dependent speaker verification from a single microphone array, far-field text-independent speaker verification from a single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three tasks pose a cross-channel challenge to the participants. To simulate the real-life scenario, the enrollment utterances are recorded from a close-talk cellphone, while the test utterances are recorded from the far-field microphone arrays. In this paper, we describe the database, the challenge, and the baseline...
Achiral TTHPs except 5k aggregate via N1-inversion-based racemates, and inter-enantiomer cofacial π-stacking abnormally enhances Φ SF values significantly.
Purpose: To critically evaluate whether the adenosine A2A receptor (A2AR) plays a role in postnatal refractive development in mice. Methods: Custom-built biometric systems specifically designed for mice were used to assess the development of relative myopia by examining refraction and biometrics in A2AR knockout (KO) mice and wild-type (WT) littermates between postnatal days (P)28 and P56. Ocular dimensions were measured by customized optical coherence tomography (OCT), refractive state by eccentric infrared photorefraction (EIR), and corneal radius of curvature...
Gastrointestinal acute graft-versus-host disease (GI aGVHD) is a lethal complication following allogeneic hematopoietic stem cell transplantation (HSCT). However, it is still very difficult to make a diagnosis of GI aGVHD in practice. To date, no consensus plasma biomarker can be used to help the diagnosis. Here, we attempted to identify GI aGVHD associated plasma proteins in a murine model to aid the diagnosis of GI aGVHD. We used 8-plex isobaric tags for relative and absolute quantitation (8-plex iTRAQ) to screen samples taken from the murine models before...
DukeECE. As highly overlapped speech exists in the dataset, we employ an x-vector-based target-speaker voice activity detection (TS-VAD) to find the overlap between speakers. Firstly, we separately train a single-channel model for each of the 8 channels and fuse the results. In addition, we also employ cross-channel self-attention to further improve the performance, where the non-linear spatial correlations between different channels are learned and fused. Experimental results on the evaluation set show that TS-VAD reduces the DER by over 75% from 12.68%...
Recent studies have highlighted the adverse outcomes of fluid overload in critically ill patients. Therefore, its early recognition is essential for the management of these patients. Our aim was to propose a deep learning (DL) model using data from noninvasive chest X-ray (CXR) imaging associated with fluid status. We collected data from the Medical Information Mart for Intensive Care IV (MIMIC-IV, v. 1.0) and MIMIC Chest X-Ray (v. 2.0.0) databases for modeling, and from our hospital database for testing. The extravascular lung water index...
In this paper, we present the system submission for the VoxCeleb Speaker Recognition Challenge 2020 (VoxSRC-20) by the DKU-DukeECE team. For track 1, we explore various kinds of state-of-the-art front-end extractors with different pooling layers and objective loss functions. For track 3, we employ an iterative framework for self-supervised speaker representation learning based on a deep neural network (DNN). For track 4, we investigate the whole system pipeline for speaker diarization, including voice activity detection (VAD), uniform segmentation,...
This paper describes a conditional neural network architecture for Mandarin Chinese polyphone disambiguation. The system is composed of a bidirectional recurrent neural network component acting as a sentence encoder to accumulate the context correlations, followed by a prediction network that maps the polyphonic character embeddings along with the conditions to the corresponding pronunciations. We obtain the word-level condition from a pre-trained word-to-vector lookup table. One goal of polyphone disambiguation is to address the homograph problem existing in...
Overexpression of the Wilms' tumor-1 (WT1) transcription factor facilitates proliferation in acute myeloid leukemia (AML). However, whether WT1 is enriched in the leukemia-initiating cells (LICs) and leukemia stem cells (LSCs) and facilitates the self-renewal of LSCs remains poorly understood. An MLL-AF9-induced murine leukemia model was used to evaluate the effect of knockdown of wt1 on the self-renewal ability of LSC. RNA sequencing was performed on WT1-overexpressing cells to select WT1 targets. Apoptosis and colony formation assays were used to assess the anti-leukemic potential of a deubiquitinase inhibitor...
With the development of deep learning, automatic speaker verification has made considerable progress over the past few years. However, designing a lightweight and robust system with limited computational resources is still a challenging problem. Traditionally, a speaker verification system is symmetrical, indicating that the same embedding extraction model is applied for both enrollment and verification in inference. In this paper, we come up with an innovative asymmetric structure, which takes the large-scale ECAPA-TDNN model for enrollment and the small-scale ECAPA-TDNNLite model for verification....
In this paper, we introduce a large-scale and high-quality audio-visual speaker verification dataset, named VoxBlink. We propose an innovative and robust automatic audio-visual data mining pipeline to curate this dataset, which contains 1.45M utterances from 38K speakers. Due to the inherent nature of automated data collection, introducing noisy data is inevitable. Therefore, we also utilize a multi-modal purification step to generate a cleaner version of VoxBlink, named VoxBlink-clean, comprising 18K identities and 1.02M utterances. In contrast...
The purpose of our study is to analyze the clinical, ultrasonic, microbiologic, and histopathologic characteristics, management, and outcomes in a series of primary canaliculitis with concretions in patients who underwent canaliculotomy with curettage. Thirty-six patients were reviewed for age, sex, location and laterality, duration of symptoms, clinical and ultrasonic signs, result of microbiologic culture and histopathologic examination, treatment, and outcomes. The main outcome measures were the microbiological and histopathologic characteristics of the canalicular concretions; clinical and ultrasonic profiles; treatment...
Nowadays, as more and more systems achieve good performance in traditional voice conversion (VC) tasks, people's attention gradually turns to VC tasks under extreme conditions. In this paper, we propose a novel method for zero-shot voice conversion. We aim to obtain intermediate representations for speaker-content disentanglement of speech to better remove speaker information and get pure content information. Accordingly, our proposed framework contains a module that removes the speaker information from the acoustic feature of the source speaker....