NFDI4DS | UHH-SEMS - Publication Details

Speech Translation with Large Language Models: An Industrial Practice

Benchmark (surveying) Speech translation

DOI: 10.48550/arxiv.2312.13585

Publication Date: 2023-01-01

AUTHORS (7)

Zhichao Huang

Rong Ye

Tom Ko

Qianqian Dong

Shanbo Cheng

Mingxuan Wang

Hang Li

ABSTRACT

Given the great success of large language models (LLMs) across various tasks, in this paper, we introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained LLM. By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations, even from long audio inputs. Furthermore, our findings indicate that the implementation of Chain-of-Thought (CoT) prompting can yield advantages in the context of LLM-ST. Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST, establishing a new benchmark in the field of speech translation. Demo: https://speechtranslation.github.io/llm-st/.
