The DKU-DukeECE-Lenovo System for the Diarization Task of the 2021 VoxCeleb Speaker Recognition Challenge
FOS: Computer and information sciences
Sound (cs.SD)
Audio and Speech Processing (eess.AS)
0202 electrical engineering, electronic engineering, information engineering
FOS: Electrical engineering, electronic engineering, information engineering
02 engineering and technology
Computer Science - Sound
Electrical Engineering and Systems Science - Audio and Speech Processing
DOI:
10.48550/arxiv.2109.02002
Publication Date:
2021-01-01
AUTHORS (7)
ABSTRACT
This report describes the submission of the DKU-DukeECE-Lenovo team to the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2021 track 4. Our system including a voice activity detection (VAD) model, a speaker embedding model, two clustering-based speaker diarization systems with different similarity measurements, two different overlapped speech detection (OSD) models, and a target-speaker voice activity detection (TS-VAD) model. Our final submission, consisting of 5 independent systems, achieves a DER of 5.07% on the challenge test set.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES ()
CITATIONS ()
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....