Vocal tract normalization in speech recognition: Compensating for systematic speaker variability

Vocal tract, Normalization, Spectrogram, Resampling, Speaker diarisation, Image warping, Dynamic Time Warping

DOI: 10.1121/1.411700
Publication Date: 2005-10-14T19:39:58Z

AUTHORS (3)

Jordan Cohen

Terri Kamm

Andreas G. Andreou

ABSTRACT

The performance of speech recognition systems is often improved by accounting explicitly for sources of variability in the data. In the SWITCHBOARD corpus, studied during the 1994 CAIP workshop [Frontiers in Speech Processing Workshop II, CAIP (August 1994)], an attempt was made to compensate for the systematic variability due to the different vocal tract lengths of the various speakers. The method found a maximum probability parameter for each speaker, which mapped an acoustic model to the mean of the models taken from a homogeneous speaker population. The underlying acoustic model was that of a straight tube, and the estimation was accomplished by warping the spectrum linearly over a 20% range (actually accomplished by digitally resampling the data) and finding the maximum a posteriori probability of the data given the warp. The technique produces statistically significant improvements in accuracy on a speech transcription task using each of four different speech recognition systems. The best parametrizations were later found to correlate well with vocal tract estimates computed manually from spectrograms.
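The warp estimation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the source: the 20% range is taken as warp factors from 0.90 to 1.10 in steps of 0.02, the reference model is a single diagonal Gaussian over spectral bins, the prior over warp factors is flat (so the maximum a posteriori choice reduces to maximum likelihood), and the warp is applied by linear interpolation along the frequency axis rather than by resampling the waveform. All function and variable names are hypothetical.

```python
import math

# Assumed grid: 11 warp factors covering a 20% range around 1.0.
WARP_FACTORS = [round(0.90 + 0.02 * i, 2) for i in range(11)]


def warp_spectrum(spectrum, alpha):
    """Linearly rescale the frequency axis: output bin k reads input position k * alpha."""
    n = len(spectrum)
    warped = []
    for k in range(n):
        pos = k * alpha
        lo = int(math.floor(pos))
        if lo >= n - 1:
            # Past the last bin: clamp to the final value.
            warped.append(spectrum[-1])
        else:
            frac = pos - lo
            warped.append((1.0 - frac) * spectrum[lo] + frac * spectrum[lo + 1])
    return warped


def log_likelihood(frames, mean, var):
    """Sum of diagonal-Gaussian log densities over all frames and bins."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def estimate_warp(frames, mean, var, factors=WARP_FACTORS):
    """Grid search: return the warp factor whose warped frames score highest."""
    best_alpha, best_score = None, -math.inf
    for alpha in factors:
        warped = [warp_spectrum(frame, alpha) for frame in frames]
        score = log_likelihood(warped, mean, var)
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha
```

In use, `frames` would hold one speaker's spectral frames and `mean` and `var` the parameters of a model trained on the reference population; the returned factor is then applied to all of that speaker's data before recognition. The sketch omits the Jacobian term that a strict likelihood comparison across warps would include.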

CITATIONS (28)