Elizabeth Salesky

ORCID: 0000-0001-6765-1447
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Speech Recognition and Synthesis
  • Speech and dialogue systems
  • Multimodal Machine Learning Applications
  • Language and cultural evolution
  • Text Readability and Simplification
  • Music and Audio Processing
  • Phonetics and Phonology Research
  • Semantic Web and Ontologies
  • Biomedical Text Mining and Ontologies
  • Machine Learning and Data Classification
  • Data Quality and Management
  • Authorship Attribution and Profiling
  • Web Data Mining and Analysis
  • Speech and Audio Processing
  • Seismology and Earthquake Studies
  • Digital Humanities and Scholarship
  • Hand Gesture Recognition Systems
  • GNSS positioning and interference
  • Subtitles and Audiovisual Media
  • Linguistic Variation and Morphology
  • Hearing Impairment and Communication
  • Advanced Data Processing Techniques
  • Domain Adaptation and Few-Shot Learning

Affiliations

Johns Hopkins University
2020-2023

Microsoft (United States)
2023

University of Copenhagen
2020-2023

Charles University
2023

The University of Melbourne
2020-2022

Carnegie Mellon University
2013-2022

Carleton University
2022

Georgetown University
2022

Yale University
2022

University of Florida
2022

Publications

Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John...

10.18653/v1/2022.iwslt-1.10 article EN cc-by 2022-01-01

Milind Agarwal, Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny...

10.18653/v1/2023.iwslt-1.1 article EN cc-by 2023-01-01

Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang. Proceedings of the 17th International Conference on Spoken Language Translation. 2020.

10.18653/v1/2020.iwslt-1.1 article EN cc-by 2020-01-01

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages. The corpus is a collection of audio recordings from TEDx talks in 8 source languages. We segment transcripts into sentences and align them to the source-language audio and target-language translations. The corpus is released along with open-sourced code enabling extension to new talks and languages as they become available. Our corpus creation methodology can be applied to more languages than previous work, and creates...

10.21437/interspeech.2021-11 article EN Interspeech 2021 2021-08-27

Ekaterina Vylomova, Jennifer White, Elizabeth Salesky, Sabrina J. Mielke, Shijie Wu, Edoardo Maria Ponti, Rowan Hall Maudslay, Ran Zmigrod, Josef Valvoda, Svetlana Toldova, Francis Tyers, Elena Klyachko, Ilya Yegorov, Natalia Krizhanovsky, Paula Czarnowska, Irene Nikkarinen, Andrew Krizhanovsky, Tiago Pimentel, Lucas Torroba Hennigen, Christo Kirov, Garrett Nicolai, Adina Williams, Antonios Anastasopoulos, Hilaria Cruz, Eleanor Chodroff, Ryan Cotterell, Miikka Silfverberg, Mans Hulden. Proceedings of the...

10.18653/v1/2020.sigmorphon-1.1 article EN cc-by 2020-01-01

Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner. Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021). 2021.

10.18653/v1/2021.iwslt-1.1 article EN cc-by 2021-01-01

What are the units of text that we want to model? From bytes to multi-word expressions, text can be analyzed and generated at many granularities. Until recently, most natural language processing (NLP) models operated over words, treating those as discrete and atomic tokens, but starting with byte-pair encoding (BPE), subword-based approaches have become dominant in many areas, enabling small vocabularies while still allowing for fast inference. Is the end of the road character-level modeling or byte-level processing? In...

10.48550/arxiv.2112.10508 preprint EN other-oa arXiv (Cornell University) 2021-01-01
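
The subword methods this survey covers are easy to make concrete. Below is a minimal Python sketch of the greedy BPE merge-learning step (an illustration over a toy word-frequency dictionary, not code from the survey): it repeatedly counts adjacent symbol pairs and merges the most frequent one.

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merges from a {word: frequency} dict.
    A minimal sketch of the greedy procedure; real tokenizers add
    many details (end-of-word markers, byte fallback, etc.)."""
    # Represent each word as a tuple of symbols, starting from characters.
    vocab = {tuple(w): f for w, f in words.items()}
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for syms, freq in vocab.items():
            for a, b in zip(syms, syms[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        # Replace every occurrence of the best pair with a merged symbol.
        new_vocab = {}
        for syms, freq in vocab.items():
            out, i = [], 0
            while i < len(syms):
                if i + 1 < len(syms) and (syms[i], syms[i + 1]) == best:
                    out.append(syms[i] + syms[i + 1])
                    i += 2
                else:
                    out.append(syms[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

print(bpe_merges({"lower": 5, "lowest": 2, "newer": 6}, 3))
```
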
Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian E. Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William S. Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana V. Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Στέλλα Μαρκαντωνάτου, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer Duffield White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Woliński, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo Maria Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud’hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation, and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last couple of years (since McCarthy et al. (2020)). Collaborative efforts...

10.48550/arxiv.2205.03608 preprint EN other-oa arXiv (Cornell University) 2022-01-01
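
For readers unfamiliar with the resource, UniMorph distributes each language's data as tab-separated (lemma, inflected form, feature bundle) triples, with features joined by semicolons. The short Python sketch below (illustrative, not project code) parses a few such triples.

```python
# Each UniMorph data file is a tab-separated list of
# (lemma, inflected form, feature bundle) triples, e.g.:
sample = """\
run\trunning\tV;V.PTCP;PRS
run\tran\tV;PST
goose\tgeese\tN;PL
"""

def parse_unimorph(text):
    """Parse UniMorph TSV triples into (lemma, form, features) tuples,
    splitting the schema's ';'-delimited feature bundles."""
    entries = []
    for line in text.strip().splitlines():
        lemma, form, feats = line.split("\t")
        entries.append((lemma, form, feats.split(";")))
    return entries

for lemma, form, feats in parse_unimorph(sample):
    print(f"{lemma} -> {form}  {feats}")
```
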

Previous work on end-to-end translation from speech has primarily used frame-level features as speech representations, which creates longer, sparser sequences than text. We show that a naive method to create compressed phoneme-like speech representations is far more effective and efficient for translation than traditional frame-level features. Specifically, we generate phoneme labels for the frames and average consecutive frames with the same label to create shorter, higher-level source sequences for translation. We see improvements of up to 5 BLEU on both our high and low resource...

10.18653/v1/p19-1179 article EN cc-by 2019-01-01
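
The reduction described above is simple to state in code. The sketch below (an illustration, not the paper's implementation) collapses consecutive frames that share a phoneme label into their mean feature vector; the labels themselves are assumed to come from an external phoneme recognizer.

```python
import numpy as np

def average_by_phoneme(frames, labels):
    """Collapse runs of consecutive frames with the same phoneme label
    into their mean vector, yielding a shorter, higher-level sequence."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        # Close a segment at the end of the sequence or at a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append(frames[start:i].mean(axis=0))
            start = i
    return np.stack(segments)

# 6 frames of 3-dim features labelled k, k, ae, ae, ae, t -> 3 segment vectors
frames = np.arange(18, dtype=float).reshape(6, 3)
labels = ["k", "k", "ae", "ae", "ae", "t"]
print(average_by_phoneme(frames, labels).shape)  # (3, 3)
```
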

Prior work has explored directly regularizing the output distributions of probabilistic models to alleviate peaky (i.e. over-confident) predictions, a common sign of overfitting. This class of techniques, of which label smoothing is one, has a connection to entropy regularization. Despite the consistent success of label smoothing across architectures and data sets in language generation tasks, two problems remain open: (1) there is little understanding of the underlying effects these regularizers have on models, and (2) the full space of entropy regularization...

10.18653/v1/2020.acl-main.615 article EN cc-by 2020-01-01
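
As one concrete point in this space, label smoothing itself can be written as standard cross-entropy mixed with a penalty toward the uniform distribution. The PyTorch sketch below shows that baseline regularizer only; it is not the paper's generalized family.

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, targets, epsilon=0.1):
    """Cross-entropy with label smoothing: mix the one-hot target with a
    uniform distribution, which penalizes over-confident (peaky) outputs."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Standard negative log-likelihood of the gold class.
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Expected NLL under a uniform target: a KL-to-uniform term up to a constant.
    uniform = -log_probs.mean(dim=-1)
    return ((1 - epsilon) * nll + epsilon * uniform).mean()

logits = torch.randn(4, 10)            # batch of 4, 10-class output
targets = torch.tensor([1, 3, 5, 7])
print(label_smoothing_loss(logits, targets))
```
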

This paper reports on the shared tasks organized by the 21st IWSLT Conference. The shared tasks address 7 scientific challenges in spoken language translation: simultaneous and offline translation, automatic subtitling and dubbing, speech-to-speech translation, dialect and low-resource speech translation, and Indic languages. The shared tasks attracted 18 teams whose submissions are documented in 26 system papers. The growing interest towards spoken language translation is also witnessed by the constantly increasing number of task organizers and contributors to the overview paper, almost evenly...

10.48550/arxiv.2411.05088 preprint EN arXiv (Cornell University) 2024-11-07

Machine translation models have discrete vocabularies and commonly use subword segmentation techniques to achieve an ‘open vocabulary.’ This approach relies on consistent and correct underlying unicode sequences, and makes models susceptible to degradation from common types of noise and variation. Motivated by the robustness of human language processing, we propose the use of visual text representations, which dispense with a finite set of text embeddings in favor of continuous vocabularies created by processing visually rendered text with sliding windows. We...

10.18653/v1/2021.emnlp-main.576 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01
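
A rough sketch of the idea, assuming Pillow for rendering (the paper uses its own text renderer and learns over the windows with a convolutional block): render the string to a grayscale image and slice it into overlapping windows that stand in for subword embeddings.

```python
from PIL import Image, ImageDraw, ImageFont
import numpy as np

def render_text_windows(text, height=24, window=24, stride=12):
    """Render a string to a grayscale image, then cut the image into
    overlapping sliding windows along the horizontal axis. The width
    heuristic (8 px per character) is a simplification for this demo."""
    img = Image.new("L", (max(window, 8 * len(text)), height), color=255)
    ImageDraw.Draw(img).text((0, 4), text, fill=0,
                             font=ImageFont.load_default())
    pixels = np.asarray(img, dtype=np.float32) / 255.0
    return [pixels[:, i:i + window]
            for i in range(0, pixels.shape[1] - window + 1, stride)]

windows = render_text_windows("visual text representations")
print(len(windows), windows[0].shape)  # number of windows, each (24, 24)
```

Because the model sees pixels rather than codepoints, visually similar noise (diacritics, confusable characters) perturbs the input only slightly instead of producing unknown tokens.
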

As large language models (LLM) become more and more capable in languages other than English, it is important to collect benchmark datasets in order to evaluate their multilingual performance, including on tasks like machine translation (MT). In this work, we extend the WMT24 dataset to cover 55 languages by collecting new human-written references and post-edits for 46 new languages and dialects, in addition to post-edits of the references in 8 out of 9 languages in the original dataset. The dataset covers four domains: literary, news, social, and speech. We benchmark a variety of MT providers and LLMs on the collected dataset using...

10.48550/arxiv.2502.12404 preprint EN arXiv (Cornell University) 2025-02-17

End-to-end models for speech translation (ST) more tightly couple speech recognition (ASR) and machine translation (MT) than a traditional cascade of separate ASR and MT models, with simpler model architectures and the potential for reduced error propagation. Their performance is often assumed to be superior, though in many conditions this is not yet the case. We compare cascaded and end-to-end models across high, medium, and low-resource conditions, and show that cascades remain stronger baselines. Further, we introduce two methods to incorporate...

10.18653/v1/2020.acl-main.217 article EN 2020-01-01

Transformer models are powerful sequence-to-sequence architectures that are capable of directly mapping speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text modeling, and thus is less ideal for acoustic inputs. In this work, we adapt the relative position encoding scheme to the Speech Transformer, where the key addition is the relative distance between input states in the self-attention network. As a result, the network can better adapt to the variable distributions present in speech data. Our...

10.21437/interspeech.2020-2526 article EN Interspeech 2020 2020-10-25
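
The key change can be sketched in a few lines: each query-key logit is augmented with a score between the query and an embedding of the (clipped) query-key distance, in the style of Shaw et al. (2018). The single-head PyTorch sketch below is illustrative, not the paper's implementation.

```python
import torch

def attention_with_relative_positions(q, k, v, rel_emb):
    """Single-head self-attention where each query-key logit also scores
    the query against an embedding of their relative distance, clipped
    to the range covered by rel_emb."""
    T, d = q.shape
    max_rel = (rel_emb.size(0) - 1) // 2
    # Distance matrix (key index minus query index), clipped and shifted
    # so it indexes into rel_emb.
    idx = torch.clamp(torch.arange(T)[None, :] - torch.arange(T)[:, None],
                      -max_rel, max_rel) + max_rel
    logits = q @ k.T + torch.einsum("td,tsd->ts", q, rel_emb[idx])
    weights = torch.softmax(logits / d ** 0.5, dim=-1)
    return weights @ v

T, d, max_rel = 5, 8, 3
q = k = v = torch.randn(T, d)
rel_emb = torch.randn(2 * max_rel + 1, d)  # one embedding per clipped distance
print(attention_with_relative_positions(q, k, v, rel_emb).shape)  # (5, 8)
```
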

Language models are defined over a finite set of inputs, which creates a vocabulary bottleneck when we attempt to scale the number of supported languages. Tackling this bottleneck results in a trade-off between what can be represented in the embedding matrix and computational issues in the output layer. This paper introduces PIXEL, the Pixel-based Encoder of Language, which suffers from neither of these issues. PIXEL is a pretrained language model that renders text as images, making it possible to transfer representations across languages...

10.48550/arxiv.2207.06991 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Elizabeth Salesky, Matthias Sperber, Alexander Waibel. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1285 article EN 2019-01-01

When translating from speech, special consideration for conversational speech phenomena such as disfluencies is necessary. Most machine translation training data consists of well-formed written texts, causing issues when translating spontaneous speech. Previous work has introduced an intermediate step between speech recognition (ASR) and machine translation (MT) to remove disfluencies, making the data better-matched to typical written text and significantly improving performance. However, with the rise of end-to-end speech translation systems, this step must be incorporated into...

10.1109/slt.2018.8639661 article EN 2018 IEEE Spoken Language Technology Workshop (SLT) 2018-12-01

While there exist scores of natural languages, each with its unique features and idiosyncrasies, they all share a unifying theme: enabling human communication. We may thus reasonably predict that human cognition shapes how these languages evolve and are used. Assuming that the capacity to process information is roughly constant across human populations, we expect a surprisal–duration trade-off to arise both across and within languages. We analyse this trade-off using a corpus of 600 languages and, after controlling for several potential confounds, find...

10.18653/v1/2021.emnlp-main.73 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01
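
For concreteness, the surprisal of a unit (here, a phone) is its negative log-probability in context, and the trade-off predicts that duration increases with surprisal. The linear form below is an illustrative reading of that prediction, not the paper's exact regression model.

```latex
% Surprisal of unit w_t in its context:
s(w_t) = -\log_2 p(w_t \mid w_{<t})
% Trade-off: expected duration increases with surprisal (illustrative form):
\mathbb{E}[\mathrm{dur}(w_t)] = \alpha + \beta\, s(w_t), \qquad \beta > 0
```
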

Johannes Bjerva, Elizabeth Salesky, Sabrina J. Mielke, Aditi Chaudhary, Giuseppe G. A. Celano, Edoardo Maria Ponti, Ekaterina Vylomova, Ryan Cotterell, Isabelle Augenstein. Proceedings of the Second Workshop on Computational Research in Linguistic Typology. 2020.

10.18653/v1/2020.sigtyp-1.1 article EN cc-by 2020-01-01

Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W Black, Jason Eisner. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

10.18653/v1/2020.acl-main.415 article EN cc-by 2020-01-01