- Natural Language Processing Techniques
- Topic Modeling
- Speech Recognition and Synthesis
- Text Readability and Simplification
- Speech and Dialogue Systems
- Translation Studies and Practices
- Subtitles and Audiovisual Media
- Interpreting and Communication in Healthcare
- Multimodal Machine Learning Applications
- Music and Audio Processing
- Mobile Agent-Based Network Management
- Usability and User Interface Design
- Phonetics and Phonology Research
- Neural Networks and Applications
- Speech and Audio Processing
- Software Engineering Research
- Digital Humanities and Scholarship
- Folklore, Mythology, and Literature Studies
- Context-Aware Activity Recognition Systems
- Law, AI, and Intellectual Property
National Institute of Information and Communications Technology
2023
Charles University
2018-2021
Karlsruhe Institute of Technology
2021
University of Edinburgh
2021
Saarland University
2017
Dominik Macháček, Raj Dabre, Ondřej Bojar. Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations. 2023.
We describe our NMT systems submitted to the WMT19 shared task in English→Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or the Marian framework. We aimed at improving the adequacy and coherence of translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this "document-level"-trained system achieves a +0.6...
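The overlapping multi-sentence splitting mentioned in the abstract can be sketched as follows. This is a hypothetical illustration only: the function name and the `size`/`stride` parameters are assumptions for the example, not values taken from the paper.

```python
def overlapping_segments(sentences, size=3, stride=2):
    """Split a list of sentences into possibly overlapping
    multi-sentence segments (illustrative sketch, not the
    paper's actual preprocessing)."""
    segments = []
    for start in range(0, len(sentences), stride):
        segment = sentences[start:start + size]
        if segment:
            segments.append(" ".join(segment))
        if start + size >= len(sentences):
            break  # last window already covers the document end
    return segments

# Each segment shares one sentence of context with its neighbor:
overlapping_segments(["S1.", "S2.", "S3.", "S4.", "S5."])
# → ['S1. S2. S3.', 'S3. S4. S5.']
```

Translating such segments instead of isolated sentences is what gives the model access to cross-sentence context on both the source and target side.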
Ondřej Bojar, Dominik Macháček, Sangeet Sagar, Otakar Smrž, Jonáš Kratochvíl, Peter Polák, Ebrahim Ansari, Mohammad Mahmoudi, Rishu Kumar, Dario Franceschini, Chiara Canton, Ivan Simonini, Thai-Son Nguyen, Felix Schneider, Sebastian Stüker, Alex Waibel, Barry Haddow, Rico Sennrich, Philip Williams. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 2021.
The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. The experiments were, however, limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparably larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one...
Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao. Proceedings of the 17th International Conference on Spoken Language Translation. 2020.
Simultaneous speech-to-text translation (SimulST) translates source-language speech into target-language text concurrently with the speaker's speech, ensuring low latency for better user comprehension. Despite its intended application to unbounded speech, most research has focused on human pre-segmented speech, simplifying the task and overlooking significant challenges. This narrow focus, coupled with widespread terminological inconsistencies, is limiting the applicability of research outcomes to real-world applications...
In this paper we describe the CUNI translation system used for the unsupervised news shared task of the ACL 2019 Fourth Conference on Machine Translation (WMT19). We follow the strategy of Artetxe et al. (2018b), creating a seed phrase-based system where the phrase table is initialized from cross-lingual embedding mappings trained on monolingual data, followed by a neural machine translation system trained on synthetic parallel data. The synthetic corpus was produced by a tuned PBMT model refined through iterative back-translation. We further focus on handling named...
There have been several meta-evaluation studies on the correlation between human ratings and offline machine translation (MT) evaluation metrics such as BLEU, chrF2, BertScore and COMET. These metrics have been used to evaluate simultaneous speech translation (SST), but their correlations with human ratings of SST, which have recently been collected as Continuous Ratings (CR), are unclear. In this paper, we leverage the evaluations of candidate systems submitted to the English-German SST task at IWSLT 2022 and conduct an extensive analysis of CR and the aforementioned metrics...
We describe work done in the field of folkloristics, consisting of creating ontologies based on well-established studies proposed by "classical" folklorists. This work supports making a huge amount of digital structured knowledge on folktales available to humanists. The ontological encoding of past and current motif-indexation and classification systems was a first step, limited to English language data. This led us to focus on making those newly generated formal knowledge sources available in a few more languages, like German, Russian...
In simultaneous speech translation, one can vary the size of the output window, the system latency, and sometimes the allowed level of rewriting. The effect of these properties on readability and comprehensibility has not been tested with modern neural translation systems. In this work, we propose an evaluation method to investigate the effects on comprehension and user preferences. It is a pilot study with 14 users watching 2 hours of German documentaries or speeches with online translations into Czech. We collect continuous feedback and answers to factual...
Our book "The Reality of Multi-Lingual Machine Translation" discusses the benefits and perils of using more than two languages in machine translation systems. While focused on the particular task of sequence-to-sequence processing and multi-task learning, the book targets an audience somewhat beyond the area of natural language processing. Machine translation is for us a prime example of deep learning applications where human skills and capabilities are taken as a benchmark that many try to match and surpass. We document some of the gains observed in multi-lingual settings; they may result...
In this paper, we present our submission to the Non-Native Speech Translation Task for IWSLT 2020. Our main contribution is a proposed speech recognition pipeline that consists of an acoustic model and a phoneme-to-grapheme model. As an intermediate representation, we utilize phonemes. We demonstrate that the proposed pipeline surpasses commercially used automatic speech recognition (ASR) and submit it to the ASR track. We complement it with off-the-shelf MT systems and take part also in the translation track.
Some methods of automatic simultaneous translation of long-form speech allow revisions of outputs, trading accuracy for low latency. Deploying these systems for users faces the problem of presenting subtitles in a limited space, such as two lines on a television screen. The subtitles must be shown promptly, incrementally, and with adequate time for reading. We provide an algorithm for subtitling. Furthermore, we propose a way to estimate the overall usability of the combination of automatic translation and subtitling by measuring the quality, latency, stability...
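The constraint described above, showing a growing translation incrementally in a fixed two-line window, can be sketched as a simple scrolling buffer. This is a toy illustration under assumed parameters (`width`, `lines`), not the paper's actual subtitling algorithm, which additionally handles timing, reading speed, and output revisions.

```python
def update_subtitle_window(window, new_words, width=40, lines=2):
    """Append words to a subtitle window of `lines` rows of at most
    `width` characters, scrolling up when the bottom row overflows.
    Illustrative sketch only."""
    for word in new_words:
        bottom = window[-1]
        if not bottom or len(bottom) + 1 + len(word) <= width:
            window[-1] = (bottom + " " + word).strip()
        else:
            window.append(word)   # start a new row
            if len(window) > lines:
                window.pop(0)     # scroll: drop the oldest row
    return window

# Words arrive incrementally; older lines scroll out of the window:
update_subtitle_window([""], "the quick brown fox".split(), width=10)
# → ['the quick', 'brown fox']
```

Measuring how often such a window must scroll or be rewritten is one way to quantify the latency and stability trade-off the abstract refers to.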
This paper is an ELITR system submission for the non-native speech translation task at IWSLT 2020. We describe systems for offline ASR, real-time ASR, and our cascaded approaches to offline SLT and real-time SLT. We select our primary candidates from a pool of pre-existing systems, develop a new end-to-end general ASR system, and a hybrid ASR trained on non-native speech. The provided small validation set prevents us from carrying out a complex validation, but we submit all the unselected candidates for contrastive evaluation on the test set.
Automatic speech translation is sensitive to speech recognition errors, but in a multilingual scenario, the same content may be available in various languages via simultaneous interpreting, dubbing or subtitling. In this paper, we hypothesize that leveraging multiple sources will improve translation quality if the sources complement one another in terms of the correct information they contain. To this end, we first show on the 10-hour ESIC corpus that the ASR errors in the original English speech and in its interpreting into German and Czech are mutually independent. We...
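A minimal way to see why mutually independent error sources can help is token-level majority voting across the sources. This toy sketch assumes the transcripts are already aligned token-by-token, which the paper's setting does not guarantee; it is an illustration of the multi-source intuition, not the paper's method.

```python
from collections import Counter

def majority_vote(transcripts):
    """Combine token-aligned transcripts from multiple sources by
    per-position majority vote (toy illustration)."""
    combined = []
    for tokens in zip(*transcripts):
        word, _count = Counter(tokens).most_common(1)[0]
        combined.append(word)
    return combined

# Each source makes a different error, so voting recovers the truth:
majority_vote([["a", "b", "c"],
               ["a", "x", "c"],
               ["a", "b", "y"]])
# → ['a', 'b', 'c']
```

When errors are independent across sources, the chance that a majority of sources is wrong at the same position drops quickly with the number of sources.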
Whisper is one of the recent state-of-the-art multilingual speech recognition and translation models; however, it is not designed for real-time transcription. In this paper, we build on top of Whisper and create Whisper-Streaming, an implementation of real-time speech transcription and translation for Whisper-like models. Whisper-Streaming uses a local agreement policy with self-adaptive latency to enable streaming transcription. We show that Whisper-Streaming achieves high quality with 3.3 seconds latency on an unsegmented long-form test set, and we demonstrate its robustness and practical usability...
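The core of a local agreement policy is to commit only the tokens on which two consecutive hypotheses agree. The sketch below shows that idea at the token level; the function name and structure are illustrative assumptions, not the actual Whisper-Streaming implementation (which additionally manages the audio buffer and latency adaptation).

```python
def local_agreement(prev_hyp, curr_hyp, committed):
    """Commit the longest common prefix of two consecutive hypotheses
    that extends beyond the already-committed output.
    Minimal sketch of the local-agreement idea."""
    agreed = []
    for a, b in zip(prev_hyp, curr_hyp):
        if a != b:
            break  # hypotheses diverge here; stop committing
        agreed.append(a)
    new_tokens = agreed[len(committed):]  # only emit what is new
    return committed + new_tokens, new_tokens

# Two consecutive hypotheses agree on the first two tokens, so only
# those are emitted; the unstable tail ("how" vs "who") is withheld:
local_agreement(["hello", "world", "how"],
                ["hello", "world", "who"], [])
# → (['hello', 'world'], ['hello', 'world'])
```

Withholding the unstable suffix is what trades a little latency for output that never needs to be rewritten.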