NFDI4DS | UHH-SEMS - Publication Details

Speaker Embedding Extraction with Phonetic Information

FOS: Computer and information sciences; Sound (cs.SD); 03 medical and health sciences; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology; 0305 other medical science; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing

DOI: 10.21437/interspeech.2018-1226
Publication Date: 2018-08-28T09:55:42Z


AUTHORS (4)

Yi Liu

Liang He

Jia Liu

Michael T. Johnson

ABSTRACT

Submitted to Interspeech 2018 (accepted) and open-sourced. Please refer to Interspeech for the final version.

Speaker embeddings achieve promising results on many speaker verification tasks. Phonetic information, an important component of speech, is rarely considered when extracting speaker embeddings. In this paper, we introduce phonetic information into speaker embedding extraction based on the x-vector architecture. Two methods are proposed, one using phonetic vectors and one using multi-task learning. On the Fisher dataset, our best system outperforms the original x-vector approach by 20% in EER and by 15% in both minDCF08 and minDCF10. Experiments conducted on NIST SRE10 further demonstrate the effectiveness of the proposed methods.
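The abstract reports its gains in EER (equal error rate), the operating point at which the false-acceptance and false-rejection rates of a verification system coincide. As a rough illustration of how such a figure can be obtained from trial scores, here is a minimal sketch in plain Python. The function names, the threshold sweep, and the toy scores are assumptions made for illustration and are not taken from the paper or its released code. The `relative_reduction` helper further assumes that the reported percentages are read as relative reductions against the baseline, which the abstract does not state explicitly.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper: a trial is accepted when its score is at or above
    the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above the threshold.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error metric with respect to a baseline."""
    return (baseline - improved) / baseline


# Toy scores, invented for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, nontargets)
```

With these toy scores, one target trial falls below the best threshold and one non-target trial lies at or above it, so both error rates are one in four. Production toolkits typically interpolate between thresholds rather than picking the closest observed one, so their values can differ slightly from this sketch.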


CITATIONS (32)
