Improved Noisy Student Training for Automatic Speech Recognition
DOI:
10.21437/interspeech.2020-1470
Publication Date:
2020-10-27T09:22:11Z
AUTHORS (8)
ABSTRACT
Recently, a semi-supervised learning method known as "noisy student training" has been shown to improve image classification performance of deep networks significantly. Noisy student training is an iterative self-training method that leverages augmentation to improve network performance. In this work, we adapt and improve noisy student training for automatic speech recognition, employing (adaptive) SpecAugment as the augmentation method. We find effective methods to filter, balance and augment the data generated in between self-training iterations. By doing so, we are able to obtain word error rates (WERs) of 4.2%/8.6% on the clean/noisy LibriSpeech test sets by only using the clean 100h subset of LibriSpeech as the supervised set and the rest (860h) as the unlabeled set. Furthermore, we achieve WERs of 1.7%/3.4% by using the unlab-60k subset of Libri-Light as the unlabeled set for LibriSpeech 960h. We thus improve upon the previous state-of-the-art clean/noisy test WERs achieved on LibriSpeech 100h (4.74%/12.20%) and LibriSpeech 960h (1.9%/4.1%).
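The self-training loop the abstract describes (teacher labels the unlabeled set; the generated data is filtered, balanced, and augmented; a noised student is trained on the mix) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names, the confidence-threshold filter, and the length-bucket balancing cap are hypothetical stand-ins, and the real models are large ASR networks trained with (adaptive) SpecAugment rather than the generic `augment` callback shown here.

```python
def filter_by_confidence(pseudo_labels, threshold):
    """Keep only pseudo-labeled utterances the teacher scored confidently.
    Each entry is (features, transcript, score)."""
    return [(x, y) for x, y, score in pseudo_labels if score >= threshold]


def balance_by_length(data, max_per_bucket, bucket_size=10):
    """Cap the number of utterances per transcript-length bucket so the
    generated data's length distribution stays roughly balanced
    (hypothetical balancing scheme for illustration)."""
    counts = {}
    out = []
    for x, y in data:
        b = len(y) // bucket_size
        if counts.get(b, 0) < max_per_bucket:
            counts[b] = counts.get(b, 0) + 1
            out.append((x, y))
    return out


def noisy_student_round(teacher, labeled, unlabeled, augment, train_fn,
                        threshold=0.9, max_per_bucket=1000):
    """One self-training generation: label, filter, balance, mix, then
    train a student on augmented (noised) inputs. The student would serve
    as the teacher for the next generation."""
    # teacher(x) is assumed to return (transcript, confidence_score).
    pseudo = [(x,) + teacher(x) for x in unlabeled]
    generated = balance_by_length(
        filter_by_confidence(pseudo, threshold), max_per_bucket)
    mixed = labeled + generated
    # Noise is injected only on the student side via augmentation.
    student = train_fn([(augment(x), y) for x, y in mixed])
    return student
```

In an actual run, `noisy_student_round` would be iterated several times, with each trained student replacing the teacher for the next generation.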
CITATIONS (119)