PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding
End-to-end principle
Spoken Language
Deliberation
DOI: 10.48550/arxiv.2406.07823
Publication Date: 2024-06-11
AUTHORS (9)
ABSTRACT
Spoken Language Understanding (SLU) is a critical component of voice assistants; it consists of converting speech to semantic parses for task execution. Previous works have explored end-to-end models to improve the quality and robustness of SLU with Deliberation; however, these models remained autoregressive, resulting in higher latencies. In this work we introduce PRoDeliberation, a novel method leveraging a Connectionist Temporal Classification (CTC)-based decoding strategy as well as a denoising objective to train robust non-autoregressive deliberation models. We show that PRoDeliberation achieves the latency reduction of parallel decoding (2-10x improvement over autoregressive models) while retaining the ability to correct Automatic Speech Recognition (ASR) mistranscriptions of autoregressive deliberation systems. We further show that the design of the denoising training allows PRoDeliberation to overcome the limitations of small ASR devices, and we provide analysis on the necessity of each component of the system.
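For readers unfamiliar with CTC-based decoding, the following is a minimal sketch of the generic greedy CTC collapse rule (merge consecutive repeated labels, then drop blanks). It is not the authors' implementation: the function name `ctc_greedy_collapse`, the blank index 0, and the toy token ids are all assumptions made for illustration. It shows why such decoding can run in parallel: every frame's label is chosen independently, and only this cheap collapse step runs afterwards.

```python
from itertools import groupby

BLANK = 0  # assumed blank index; real systems define their own


def ctc_greedy_collapse(frame_ids, blank=BLANK):
    """Collapse per-frame label ids: merge consecutive repeats, then drop blanks."""
    return [tok for tok, _ in groupby(frame_ids) if tok != blank]


# Toy per-frame argmax ids (hypothetical): blanks separate the two 3s,
# so both survive the collapse.
frames = [0, 3, 3, 0, 0, 5, 5, 5, 0, 3]
print(ctc_greedy_collapse(frames))  # [3, 5, 3]
```

Note that a blank between two identical labels keeps them distinct, which is how CTC represents genuinely repeated output tokens.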