speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment

Pronunciation
DOI: 10.21437/interspeech.2021-1259 Publication Date: 2021-08-27T05:59:39Z
ABSTRACT
This paper introduces a new open-source speech corpus named "speechocean762" designed for pronunciation assessment use, consisting of 5000 English utterances from 250 non-native speakers, where half of the speakers are children. Five experts annotated each utterance at sentence-level, word-level and phoneme-level. A baseline system is released in open source to illustrate the phoneme-level pronunciation assessment workflow on this corpus. This corpus is allowed to be used freely for commercial and non-commercial purposes. It is available for free download from OpenSLR, and the corresponding baseline system is published in the Kaldi speech recognition toolkit.