“You don’t understand me!”: Comparing ASR Results for L1 and L2 Speakers of Swedish
FOS: Computer and information sciences
Other Engineering and Technologies
Sound (cs.SD)
Computer Science - Computation and Language
automatic speech recognition
02 engineering and technology
language learning
Computer Science - Sound
non-native speech
Audio and Speech Processing (eess.AS)
FOS: Electrical engineering, electronic engineering, information engineering
0202 electrical engineering, electronic engineering, information engineering
Computation and Language (cs.CL)
Electrical Engineering and Systems Science - Audio and Speech Processing
DOI:
10.21437/interspeech.2021-2140
Publication Date:
2021-08-27T05:59:39Z
AUTHORS (4)
ABSTRACT
The performance of Automatic Speech Recognition (ASR) systems has improved steadily with state-of-the-art development. However, performance tends to drop considerably in more challenging conditions (e.g., background noise, multi-speaker social conversations) and with more atypical speakers (e.g., children, non-native speakers, or people with speech disorders). General improvements therefore do not necessarily transfer to applications that rely on ASR, such as educational software for younger students or language learners. In this study, we focus on the performance gap between recognition results for native and non-native, read and spontaneous, Swedish utterances transcribed by different ASR services. We compare the recognition results using Word Error Rate and analyze the linguistic factors that may account for the observed transcription errors.
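For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the scoring pipeline used in the study. The function name `word_error_rate`, the lowercasing and whitespace tokenization, and the Swedish example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; real evaluations usually also normalize punctuation and numerals.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substitution and one deletion over four words.
    print(word_error_rate("jag förstår inte dig", "jag första inte"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words, which is worth keeping in mind when comparing read and spontaneous speech.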