Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech

Pronunciation Error Analysis
DOI: 10.21437/interspeech.2021-38 Publication Date: 2021-08-27T05:59:39Z
ABSTRACT
We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. Training this model does not require phonetically transcribed L2 speech; we only need to mark which words are mispronounced. The lack of phonetic transcriptions for L2 speech means that the model has to learn only from a weak signal of word-level mispronunciations. Because of that, and due to the limited amount of mispronounced L2 speech, the model is more likely to overfit. To limit this risk, we train it in a multi-task setup. In the first task, we estimate the probabilities of word-level mispronunciation. For the second task, we use a phoneme recognizer trained on phonetically transcribed L1 speech, which is easily accessible and can be automatically annotated. Compared to state-of-the-art approaches, we improve the accuracy of detecting word-level pronunciation errors in the AUC metric by 30% on the GUT Isle Corpus of L2 Polish speakers, and by 21.5% on the Isle Corpus of L2 German and Italian speakers.
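The multi-task setup described in the abstract can be illustrated with a minimal sketch of a combined training objective. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the use of binary cross-entropy for the word-level task, frame-wise cross-entropy for the phoneme task, and the `aux_weight` mixing factor are all hypothetical choices.

```python
import math

EPS = 1e-12  # guards against log(0)

def word_level_loss(l2_word_probs, l2_word_labels):
    """Task 1 (assumed form): mean binary cross-entropy over L2 words.
    A label of 1 means the word was marked as mispronounced."""
    total = 0.0
    for p, y in zip(l2_word_probs, l2_word_labels):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(l2_word_probs)

def phoneme_loss(l1_phoneme_probs, l1_phoneme_targets):
    """Task 2 (assumed form): mean negative log-likelihood of the
    target phoneme at each position of an L1 utterance."""
    total = 0.0
    for dist, target in zip(l1_phoneme_probs, l1_phoneme_targets):
        total += -math.log(max(dist[target], EPS))
    return total / len(l1_phoneme_targets)

def multi_task_loss(l2_word_probs, l2_word_labels,
                    l1_phoneme_probs, l1_phoneme_targets,
                    aux_weight=0.5):
    """Weighted sum of the two task losses; aux_weight is a
    hypothetical hyperparameter, not a value from the paper."""
    return (word_level_loss(l2_word_probs, l2_word_labels)
            + aux_weight * phoneme_loss(l1_phoneme_probs, l1_phoneme_targets))
```

In this sketch the auxiliary phoneme task acts as a regularizer: its loss is computed on L1 data, which is plentiful, so that a shared model would be less prone to overfitting the small set of word-level L2 labels.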