Benjamin Peloquin

ORCID: 0000-0002-4876-9906
Research Areas
  • Natural Language Processing Techniques
  • Speech and Dialogue Systems
  • Topic Modeling
  • Speech Recognition and Synthesis
  • Scientometrics and Bibliometrics Research
  • Language and cultural evolution
  • Meta-analysis and systematic reviews
  • Language Development and Disorders
  • Scientific Computing and Data Management
  • Opinion Dynamics and Social Influence
  • Intelligent Tutoring Systems and Adaptive Learning
  • Digital Communication and Language
  • Philosophy and History of Science
  • Advanced Text Analysis Techniques
  • Reading and Literacy Development
  • Neurobiology of Language and Bilingualism
  • Music and Audio Processing
  • Language, Discourse, Communication Strategies

Stanford University
2018-2021

Creating the Babel Fish, a tool that helps individuals translate speech between any two languages, requires advanced technological innovation and linguistic expertise. Although conventional speech-to-speech translation systems composed of multiple subsystems performing translation in a cascaded fashion exist [1–3], scalable high-performing unified systems [4,5] remain underexplored. To address this gap, here we introduce SEAMLESSM4T (Massively Multilingual and Multimodal Machine Translation), a single model that supports...

10.1038/s41586-024-08359-z article EN cc-by-nc-nd Nature 2025-01-15

For any scientific report, repeating the original analyses upon the original data should yield the original outcomes. We evaluated analytic reproducibility in 25 Psychological Science articles awarded open data badges between 2014 and 2015. Initially, 16 (64%, 95% confidence interval [43,81]) articles contained at least one 'major numerical discrepancy' (>10% difference), prompting us to request input from the original authors. Ultimately, target values were reproducible without author involvement for 9 (36% [20,59]) articles, and with author involvement for 6 (24%...

10.1098/rsos.201494 article EN cc-by Royal Society Open Science 2021-01-01

Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4T model, SeamlessM4T v2. This newer model, incorporating an updated UnitY2 framework, was trained on more low-resource language...

10.48550/arxiv.2312.05187 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

Expressive speech-to-speech translation (S2ST) aims to transfer prosodic attributes of source speech to target speech while maintaining accuracy. Existing research in expressive S2ST is limited, typically focusing on a single expressivity aspect at a time. Likewise, this area lacks standard evaluation protocols and well-curated benchmark datasets. In this work, we propose a holistic cascade system for S2ST, combining multiple prosody techniques previously considered only in isolation. We curate a test set in the TV...

10.1109/icassp49357.2023.10096183 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages? While recent breakthroughs in text-based models have pushed machine translation coverage beyond 200 languages, unified speech-to-speech translation models have yet to achieve similar strides. More specifically, conventional speech-to-speech translation systems rely on cascaded systems that perform translation progressively, putting high-performing unified systems out of reach. To address these gaps, we introduce SeamlessM4T, a single model that supports translation,...

10.48550/arxiv.2308.11596 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

In conversation, individual utterances are almost always ambiguous, with this ambiguity resolved by context and discourse history (common ground). One important cue for disambiguation is the topic under discussion with a particular partner (e.g., "want to pick?" means something different in conversation with a bluegrass musician vs. a book club partner). Here, we investigated 2- to 5-year-old American English-speaking children's (N = 131) reliance on conversational topics with specific partners to interpret...

10.1111/desc.13049 article EN cc-by-nc Developmental Science 2020-10-16

Despite their diversity, languages around the world share a consistent set of properties and distributional regularities. For example, the distribution of word frequencies, the distribution of syntactic dependency lengths, and the presence of ambiguity are all remarkably consistent across languages. We discuss a framework for studying how these system-level properties emerge from local, in-the-moment interactions of rational, pragmatic speakers and listeners. To do so, we derive a novel objective function for measuring the communicative efficiency of linguistic...

10.1111/tops.12489 article EN publisher-specific-oa Topics in Cognitive Science 2020-01-01

In conversation, individual utterances are almost always ambiguous, with this ambiguity resolved by context and discourse history (common ground). One important cue for disambiguation is the topic under discussion with a particular partner (e.g., “want to pick?” means something different in conversation with a bluegrass musician vs. a book club partner). Here, we investigated 2- to 5-year-old American English-speaking children’s (N = 131) reliance on conversational topics with specific partners to interpret...

10.31234/osf.io/gkhez preprint EN 2020-02-21

In this paper, we propose a textless acoustic model with a self-supervised distillation strategy for noise-robust expressive speech-to-speech translation (S2ST). Recently proposed S2ST systems have achieved impressive expressivity preservation performances by cascading a unit-to-speech (U2S) generator to the speech-to-unit model. However, these systems are vulnerable to the presence of noise in input speech, which is an assumption in real-world scenarios. To address this limitation, we propose a U2S generator that incorporates a distillation with no label...

10.48550/arxiv.2406.02733 preprint EN arXiv (Cornell University) 2024-06-04

Despite their diversity, languages around the world share a consistent set of properties and distributional regularities. For example, the distribution of word frequencies, the distribution of syntactic dependency lengths, and the presence of ambiguity are all remarkably consistent across languages. We discuss a framework for studying how these system-level properties emerge from local, in-the-moment interactions of rational, pragmatic speakers and listeners. To do so, we derive a novel objective function for measuring communicative efficiency...

10.31234/osf.io/8f9gv article EN 2019-02-03

Expressive speech-to-speech translation (S2ST) aims to transfer prosodic attributes of source speech to target speech while maintaining accuracy. Existing research in expressive S2ST is limited, typically focusing on a single expressivity aspect at a time. Likewise, this area lacks standard evaluation protocols and well-curated benchmark datasets. In this work, we propose a holistic cascade system for S2ST, combining multiple prosody techniques previously considered only in isolation. We curate a test set in the TV...

10.48550/arxiv.2301.10606 preprint EN other-oa arXiv (Cornell University) 2023-01-01