CAT: A CTC-CRF Based ASR Toolkit Bridging the Hybrid and the End-to-End Approaches Towards Data Efficiency and Low Latency

DOI: 10.21437/interspeech.2020-2732
Publication Date: 2020-10-27
ABSTRACT
In this paper, we present a new open source toolkit for speech recognition, named CAT (CTC-CRF based ASR Toolkit). CAT inherits the data efficiency of the hybrid approach and the simplicity of the E2E approach, providing a full-fledged implementation of CTC-CRFs and complete training and testing scripts for a number of English and Chinese benchmarks. Experiments show that CAT obtains state-of-the-art results, which are comparable to fine-tuned models in Kaldi but with a much simpler pipeline. Compared to existing non-modularized models, CAT performs better on limited-scale datasets, demonstrating its data efficiency. Furthermore, we propose a method called contextualized soft forgetting, which enables CAT to do streaming without accuracy degradation. We hope CAT, especially the CTC-CRF framework and software, will be of broad interest to the community, and can be further explored and improved.