Exploring wav2vec 2.0 on Speaker Verification and Language Identification
Identification
Speaker identification
DOI:
10.21437/interspeech.2021-1280
Publication Date:
2021-08-27T05:59:39Z
AUTHORS (4)
ABSTRACT
Wav2vec 2.0 is a recently proposed self-supervised framework for speech representation learning. It follows a two-stage training process of pre-training and fine-tuning, and performs well in speech recognition tasks, especially in ultra-low resource cases. In this work, we attempt to extend the self-supervised framework to speaker verification and language identification. First, we use some preliminary experiments to indicate that wav2vec 2.0 can capture information about the speaker and language. Then we demonstrate the effectiveness of wav2vec 2.0 on the two tasks respectively. For speaker verification, we obtain a new state-of-the-art result, an Equal Error Rate (EER) of 3.61% on the VoxCeleb1 dataset. For language identification, we obtain an EER of 12.02% on the 1 second condition and an EER of 3.47% on the full-length condition of the AP17-OLR dataset. Finally, we utilize one model to achieve unified modeling of the two tasks by multi-task learning.
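The results above are reported as Equal Error Rate, the operating point at which the false acceptance rate equals the false rejection rate. As a minimal sketch of how such a number can be computed from trial scores (this is not the authors' scoring code; the function name `equal_error_rate` and the toy scores are assumptions made for illustration), one can sweep a decision threshold and pick the point where the two error rates are closest:

```python
def equal_error_rate(scores, labels):
    """Approximate the EER from similarity scores.

    scores: higher means "more likely a target trial".
    labels: 1 for target trials, 0 for non-target trials.
    Returns (eer, threshold). Hypothetical helper, for illustration only.
    """
    targets = [s for s, y in zip(scores, labels) if y == 1]
    nontargets = [s for s, y in zip(scores, labels) if y == 0]
    if not targets or not nontargets:
        raise ValueError("need at least one target and one non-target trial")

    best = None
    # Try every observed score as a threshold; a trial is accepted if score >= t.
    for t in sorted(set(scores)):
        far = sum(s >= t for s in nontargets) / len(nontargets)  # false acceptances
        frr = sum(s < t for s in targets) / len(targets)         # false rejections
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy trial list (invented values, not data from the paper).
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
demo_labels = [1, 1, 0, 1, 1, 0, 0, 0]
demo_eer, demo_threshold = equal_error_rate(demo_scores, demo_labels)
print(f"EER = {demo_eer:.2%} at threshold {demo_threshold}")
```

On a finite trial list the two error rates rarely cross exactly, so the sketch reports their average at the closest threshold; evaluation toolkits often interpolate along the ROC curve instead, which can give slightly different values.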
CITATIONS (91)