- Natural Language Processing Techniques
- Topic Modeling
- Speech Recognition and Synthesis
- Text Readability and Simplification
- Speech and Dialogue Systems
- Translation Studies and Practices
- Subtitles and Audiovisual Media
- Interpreting and Communication in Healthcare
- Multimodal Machine Learning Applications
- Music and Audio Processing
- Mobile Agent-Based Network Management
- Usability and User Interface Design
- Phonetics and Phonology Research
- Neural Networks and Applications
- Speech and Audio Processing
- Software Engineering Research
- Digital Humanities and Scholarship
- Folklore, Mythology, and Literature Studies
- Context-Aware Activity Recognition Systems
- Law, AI, and Intellectual Property
National Institute of Information and Communications Technology
2023
Charles University
2018-2021
Karlsruhe Institute of Technology
2021
University of Edinburgh
2021
Saarland University
2017
Dominik Macháček, Raj Dabre, Ondřej Bojar. Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations. 2023.
We describe our NMT systems submitted to the WMT19 shared task in English→Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or the Marian framework. We aimed at improving the adequacy and coherence of translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this "document-level"-trained system achieves a +0.6...
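The overlapping multi-sentence splitting mentioned in the abstract can be sketched as follows. This is a hypothetical illustration only: the function name and the `size`/`stride` parameters are assumptions for the example, not values taken from the paper.

```python
def overlapping_segments(sentences, size=3, stride=2):
    """Split a list of sentences into possibly overlapping
    multi-sentence segments (illustrative sketch, not the
    paper's actual preprocessing)."""
    segments = []
    for start in range(0, len(sentences), stride):
        segment = sentences[start:start + size]
        if segment:
            segments.append(" ".join(segment))
        if start + size >= len(sentences):
            break  # last window already covers the document end
    return segments

# Each segment shares one sentence of context with its neighbor:
overlapping_segments(["S1.", "S2.", "S3.", "S4.", "S5."])
# → ['S1. S2. S3.', 'S3. S4. S5.']
```

Translating such segments instead of isolated sentences is what gives the model access to cross-sentence context on both the source and target side.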
Ondřej Bojar, Dominik Macháček, Sangeet Sagar, Otakar Smrž, Jonáš Kratochvíl, Peter Polák, Ebrahim Ansari, Mohammad Mahmoudi, Rishu Kumar, Dario Franceschini, Chiara Canton, Ivan Simonini, Thai-Son Nguyen, Felix Schneider, Sebastian Stüker, Alex Waibel, Barry Haddow, Rico Sennrich, Philip Williams. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 2021.
The utility of linguistic annotation in neural machine translation seemed to have been established in past papers. The experiments were, however, limited to recurrent sequence-to-sequence architectures and relatively small data settings. We focus on the state-of-the-art Transformer model and use comparably larger corpora. Specifically, we try to promote the knowledge of source-side syntax using multi-task learning, either through simple data manipulation techniques or through a dedicated model component. In particular, we train one...
Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao. Proceedings of the 17th International Conference on Spoken Language Translation. 2020.
Simultaneous speech-to-text translation (SimulST) translates source-language speech into target-language text concurrently with the speaker's speech, ensuring low latency for better user comprehension. Despite its intended application to unbounded speech, most research has focused on human pre-segmented speech, simplifying the task and overlooking significant challenges. This narrow focus, coupled with widespread terminological inconsistencies, is limiting the applicability of research outcomes to real-world applications...
In this paper we describe the CUNI translation system used for the unsupervised news shared task of the ACL 2019 Fourth Conference on Machine Translation (WMT19). We follow the strategy of Artetxe et al. (2018b), creating a seed phrase-based system where the phrase table is initialized from cross-lingual embedding mappings trained on monolingual data, followed by a neural machine translation system trained on synthetic parallel data. The synthetic corpus was produced by a tuned PBMT model refined through iterative back-translation. We further focus on handling named...
There have been several meta-evaluation studies on the correlation between human ratings and offline machine translation (MT) evaluation metrics such as BLEU, chrF2, BertScore and COMET. These metrics have been used to evaluate simultaneous speech translation (SST), but their correlations with human ratings of SST, which have recently been collected as Continuous Ratings (CR), are unclear. In this paper, we leverage the evaluations of candidate systems submitted to the English-German SST task at IWSLT 2022 and conduct an extensive analysis of CR and the aforementioned metrics...
We describe work done in the field of folkloristics, consisting of creating ontologies based on well-established studies proposed by "classical" folklorists. This work supports making a huge amount of digital structured knowledge on folktales available to humanists. The ontological encoding of past and current motif-indexation and classification systems was a first step, limited to English language data. This led us to focus on making those newly generated formal knowledge sources available in a few more languages, like German, Russian...
In simultaneous speech translation, one can vary the size of the output window, the system latency, and sometimes the allowed level of rewriting. The effect of these properties on readability and comprehensibility has not been tested with modern neural translation systems. In this work, we propose an evaluation method to investigate the effects on comprehension and user preferences. It is a pilot study with 14 users watching 2 hours of German documentaries or speeches with online translations into Czech. We collect continuous feedback and answers to factual...
Our book "The Reality of Multi-Lingual Machine Translation" discusses the benefits and perils of using more than two languages in machine translation systems. While focused on the particular task of sequence-to-sequence processing and multi-task learning, the book targets an audience somewhat beyond the area of natural language processing. Machine translation is for us a prime example of deep learning applications where human skills and capabilities are taken as a benchmark that many try to match and surpass. We document some of the gains observed in multi-lingual settings; they may result...
In this paper, we present our submission to the Non-Native Speech Translation Task for IWSLT 2020. Our main contribution is a proposed speech recognition pipeline that consists of an acoustic model and a phoneme-to-grapheme model. As an intermediate representation, we utilize phonemes. We demonstrate that the proposed pipeline surpasses commercially used automatic speech recognition (ASR) and submit it to the ASR track. We complement it with off-the-shelf MT systems and take part also in the translation track.
Some methods of automatic simultaneous translation of long-form speech allow revisions of outputs, trading accuracy for low latency. Deploying these systems for users faces the problem of presenting subtitles in a limited space, such as two lines on a television screen. The subtitles must be shown promptly, incrementally, and with adequate time for reading. We provide an algorithm for subtitling. Furthermore, we propose a way to estimate the overall usability of the combination of automatic translation and subtitling by measuring the quality, latency, stability...
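The constraint described above, showing a growing translation incrementally in a fixed two-line window, can be sketched as a simple scrolling buffer. This is a toy illustration under assumed parameters (`width`, `lines`), not the paper's actual subtitling algorithm, which additionally handles timing, reading speed, and output revisions.

```python
def update_subtitle_window(window, new_words, width=40, lines=2):
    """Append words to a subtitle window of `lines` rows of at most
    `width` characters, scrolling up when the bottom row overflows.
    Illustrative sketch only."""
    for word in new_words:
        bottom = window[-1]
        if not bottom or len(bottom) + 1 + len(word) <= width:
            window[-1] = (bottom + " " + word).strip()
        else:
            window.append(word)   # start a new row
            if len(window) > lines:
                window.pop(0)     # scroll: drop the oldest row
    return window

# Words arrive incrementally; older lines scroll out of the window:
update_subtitle_window([""], "the quick brown fox".split(), width=10)
# → ['the quick', 'brown fox']
```

Measuring how often such a window must scroll or be rewritten is one way to quantify the latency and stability trade-off the abstract refers to.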
This paper is an ELITR system submission for the non-native speech translation task at IWSLT 2020. We describe systems for offline ASR, real-time ASR, and our cascaded approaches to offline SLT and real-time SLT. We select our primary candidates from a pool of pre-existing systems, develop a new end-to-end general ASR system, and a hybrid ASR trained on non-native speech. The provided small validation set prevents us from carrying out a complex validation, but we submit all the unselected candidates for contrastive evaluation on the test set.
Automatic speech translation is sensitive to speech recognition errors, but in a multilingual scenario, the same content may be available in various languages via simultaneous interpreting, dubbing or subtitling. In this paper, we hypothesize that leveraging multiple sources will improve translation quality if the sources complement one another in terms of the correct information they contain. To this end, we first show on the 10-hour ESIC corpus that the ASR errors in the original English speech and in its interpreting into German and Czech are mutually independent. We...
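A minimal way to see why mutually independent error sources can help is token-level majority voting across the sources. This toy sketch assumes the transcripts are already aligned token-by-token, which the paper's setting does not guarantee; it is an illustration of the multi-source intuition, not the paper's method.

```python
from collections import Counter

def majority_vote(transcripts):
    """Combine token-aligned transcripts from multiple sources by
    per-position majority vote (toy illustration)."""
    combined = []
    for tokens in zip(*transcripts):
        word, _count = Counter(tokens).most_common(1)[0]
        combined.append(word)
    return combined

# Each source makes a different error, so voting recovers the truth:
majority_vote([["a", "b", "c"],
               ["a", "x", "c"],
               ["a", "b", "y"]])
# → ['a', 'b', 'c']
```

When errors are independent across sources, the chance that a majority of sources is wrong at the same position drops quickly with the number of sources.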
Whisper is one of the recent state-of-the-art multilingual speech recognition and translation models; however, it is not designed for real-time transcription. In this paper, we build on top of Whisper and create Whisper-Streaming, an implementation of real-time speech transcription and translation for Whisper-like models. Whisper-Streaming uses a local agreement policy with self-adaptive latency to enable streaming transcription. We show that Whisper-Streaming achieves high quality with 3.3 seconds latency on an unsegmented long-form test set, and we demonstrate its robustness and practical usability...
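The core of a local agreement policy is to commit only the tokens on which two consecutive hypotheses agree. The sketch below shows that idea at the token level; the function name and structure are illustrative assumptions, not the actual Whisper-Streaming implementation (which additionally manages the audio buffer and latency adaptation).

```python
def local_agreement(prev_hyp, curr_hyp, committed):
    """Commit the longest common prefix of two consecutive hypotheses
    that extends beyond the already-committed output.
    Minimal sketch of the local-agreement idea."""
    agreed = []
    for a, b in zip(prev_hyp, curr_hyp):
        if a != b:
            break  # hypotheses diverge here; stop committing
        agreed.append(a)
    new_tokens = agreed[len(committed):]  # only emit what is new
    return committed + new_tokens, new_tokens

# Two consecutive hypotheses agree on the first two tokens, so only
# those are emitted; the unstable tail ("how" vs "who") is withheld:
local_agreement(["hello", "world", "how"],
                ["hello", "world", "who"], [])
# → (['hello', 'world'], ['hello', 'world'])
```

Withholding the unstable suffix is what trades a little latency for output that never needs to be rewritten.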