NFDI4DS | UHH-SEMS - Publication Details

Eric Fosler‐Lussier

ORCID: 0000-0001-8004-5169

About

Contact & Profiles

A5056667180

Research Areas

Natural Language Processing Techniques
Speech Recognition and Synthesis
Topic Modeling
Speech and dialogue systems
Speech and Audio Processing
Biomedical Text Mining and Ontologies
Music and Audio Processing
Phonetics and Phonology Research
Multi-Agent Systems and Negotiation
Advanced Data Compression Techniques
Text and Document Classification Technologies
Text Readability and Simplification
Machine Learning in Healthcare
Blind Source Separation Techniques
Multimodal Machine Learning Applications
Misinformation and Its Impacts
Semantic Web and Ontologies
Intelligent Tutoring Systems and Adaptive Learning
Advanced Text Analysis Techniques
Geographic Information Systems Studies
Infant Health and Development
Software Engineering Research
Language Development and Disorders
Expert finding and Q&A systems
Context-Aware Activity Recognition Systems

The Ohio State University
2016-2025

Nationwide Children's Hospital
2018-2020

Amazon (United States)
2018-2019

University of Science and Technology of China
2019

University of Udine
2018-2019

University of Cambridge
2018-2019

Middle East Technical University
2018-2019

University of Illinois Urbana-Champaign
2018-2019

Delft University of Technology
2018-2019

The University of Tokyo
2018-2019

Effects of disfluencies, predictability, and utterance position on word form variation in English conversation

OPENALEX - Publications

Alan Bell Daniel Jurafsky Eric Fosler‐Lussier Cynthia Girand Michelle Gregory and 1 more

Function words, especially frequently occurring ones such as (the, that, and, and of), vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation (e.g., ði, ðæt, ænd, ʌv) or a more reduced or lenited pronunciation (e.g., ðə, ðīt, n, ə). It is based on over 8000 occurrences of the ten most frequent English function words in a 4-h sample...

10.1121/1.1534836 article EN The Journal of the Acoustical Society of America 2003-01-28

Discourse segmentation of multi-party conversation

Michel Galley Kathleen McKeown Eric Fosler‐Lussier Hongyan Jing

We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech. This segmentation algorithm uses automatically induced decision rules to combine the different features. The embedded text-based algorithm builds on lexical cohesion and has performance comparable to state-of-the-art algorithms based on lexical information. A significant error reduction is obtained by combining the two knowledge sources.

10.3115/1075096.1075167 article EN 2003-01-01

Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources

Joo-Kyung Kim Young‐Bum Kim Ruhi Sarikaya Eric Fosler‐Lussier

Training a POS tagging model with cross-lingual transfer learning usually requires linguistic knowledge and resources about the relation between the source language and the target language. In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora. The proposed cross-lingual model utilizes a common BLSTM that enables knowledge transfer from other languages, and private BLSTMs for language-specific representations. The cross-lingual model is trained with language-adversarial training and bidirectional language modeling as auxiliary objectives to better represent...

10.18653/v1/d17-1302 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

Effects of speaking rate and word frequency on pronunciations in convertional speech

Eric Fosler‐Lussier Nelson Morgan

10.1016/s0167-6393(99)00035-7 article EN Speech Communication 1999-11-01

Combining multiple estimators of speaking rate

N. Morgan Eric Fosler‐Lussier

We report progress in the development of a measure of speaking rate that is computed from the acoustic signal. The newest form of our analysis incorporates multiple estimates of rate; besides the spectral moment for a full-band energy envelope that we have previously reported, we also used pointwise correlation between pairs of compressed sub-band energy envelopes. The complete measure, called mrate, has been compared to a reference syllable rate derived from a manually transcribed subset of the Switchboard database. ... with significantly higher than...

10.1109/icassp.1998.675368 article EN 2002-11-27

Reduction of English function words in Switchboard

Daniel Jurafsky Alan Bell Eric Fosler‐Lussier Cynthia Girand William D. Raymond

The causes of pronunciation reduction in 8458 occurrences of ten frequent English function words in a four-hour sample from conversations from the Switchboard corpus were examined. Using ordinary linear and logistic regression models, we examined the length of the words, the form of their vowel (basic, full, or reduced), and final obstruent deletion. For all of these we found strong, independent effects of speaking rate, predictability, form of the following word, and planning problem disfluencies. The results bear on issues in speech recognition, models...

10.21437/icslp.1998-801 article EN 5th International Conference on Spoken Language Processing (ICSLP 1998) 1998-11-30

Deep neural network based spectral feature mapping for robust speech recognition

Kun Han Yanzhang He Deblin Bagchi Eric Fosler‐Lussier DeLiang Wang

Automatic speech recognition (ASR) systems suffer from performance degradation under noisy and reverberant conditions. In this work, we explore a deep neural network (DNN) based approach for spectral feature mapping from corrupted speech to clean speech. The DNN based mapping substantially reduces interference and produces estimated clean spectral features for ASR training and decoding. We experiment with several different feature mapping approaches and demonstrate that a DNN trained to predict clean log filterbank coefficients from noisy spectrogram directly can be extremely effective....

10.21437/interspeech.2015-536 article EN Interspeech 2015 2015-09-06

A Non-autoregressive Model for Joint STT and TTS

Vishal Sunder Brian Kingsbury George Saon Samuel Thomas Slava Shechtman Hagai Aronowitz and 2 more

In this paper, we take a step towards jointly modeling automatic speech recognition (STT) and speech synthesis (TTS) in a fully non-autoregressive way. We develop a novel multimodal framework capable of handling the speech and text modalities as input either individually or together. The proposed model can also be trained with unpaired speech or text data owing to its multimodal nature. We further propose an iterative refinement strategy to improve the STT and TTS performance of our model such that the partial hypothesis at the output can be fed back to the input of our model, thus iteratively...

10.48550/arxiv.2501.09104 preprint EN arXiv (Cornell University) 2025-01-15

The Ohio Child Speech Corpus

Laura Wagner Sharifa Alghowinhem Abeer Alwan Kristina Bowdrie Cynthia Breazeal and 6 more

10.1016/j.specom.2025.103206 article EN cc-by-nc Speech Communication 2025-03-04

A Non-autoregressive Model for Joint STT and TTS

Vishal Sunder Brian Kingsbury George Saon Samuel Thomas Slava Shechtman and 3 more

10.1109/icassp49660.2025.10887605 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Knowledge Graph Inference for spoken dialog systems

Yi Ma Paul Crook Ruhi Sarikaya Eric Fosler‐Lussier

We propose Inference Knowledge Graph, a novel approach of remapping existing, large-scale, semantic knowledge graphs into Markov Random Fields in order to create user goal tracking models that could form part of a spoken dialog system. Since semantic knowledge graphs include both entities and their attributes, the proposed method merges the dialog-state-tracking of attributes and the database lookup of entities that fulfill users' requests into one single unified step. Using a graph that contains all businesses in Bellevue, WA, extracted from Microsoft Satori, we...

10.1109/icassp.2015.7178992 article EN 2015-04-01

Using virtual standardized patients to accurately assess information gathering skills in medical students

Kellen Maicher Laura Zimmerman Bruce A. Wilcox Beth W. Liston Holly Cronau and 8 more

Introduction: Practicing a medical history using standardized patients is an essential component of medical school curricula. Recent advances in technology now allow for newer approaches for practicing and assessing communication skills. We describe herein a virtual standardized patient (VSP) system that allows students to practice their history taking skills and receive immediate feedback. Methods: Our VSPs consist of artificially intelligent, emotionally responsive 3D characters which communicate with students using natural language. The...

10.1080/0142159x.2019.1616683 article EN Medical Teacher 2019-06-22

Textual inference for eligibility criteria resolution in clinical trials

Chaitanya Shivade Courtney Hebert Marcelo Lopetegui Marie-Catherine de Marneffe Eric Fosler‐Lussier and 1 more

Clinical trials are essential for determining whether new interventions are effective. In order to determine the eligibility of patients to enroll into these trials, clinical trial coordinators often perform a manual review of clinical notes in the electronic health record of patients. This is a very time-consuming and exhausting task. Efforts in this process can be expedited if these coordinators are directed toward specific parts of the text that are relevant for eligibility determination. In this study, we describe the creation of a dataset that can be used to evaluate automated methods capable...

10.1016/j.jbi.2015.09.008 article EN cc-by-nc-nd Journal of Biomedical Informatics 2015-09-14

Improving Transducer-Based Spoken Language Understanding with Self-Conditioned CTC and Knowledge Transfer

Vishal Sunder Eric Fosler‐Lussier

In this paper, we propose to improve end-to-end (E2E) spoken language understanding (SLU) in an RNN transducer model (RNN-T) by incorporating a joint self-conditioned CTC automatic speech recognition (ASR) objective. Our proposed model is akin to an E2E differentiable cascaded model which performs ASR and SLU sequentially, and we ensure that the SLU task is conditioned on the ASR task by having CTC self conditioning. This novel joint modeling of ASR and SLU improves SLU performance significantly over just using SLU optimization. We further improve the performance by aligning the acoustic embeddings...

10.48550/arxiv.2501.01936 preprint EN arXiv (Cornell University) 2025-01-03

Discriminative training of language models for speech recognition

Hong-Kwang Jeff Kuo Eric Fosler‐Lussier Hui Jiang Chin‐Hui Lee

In this paper we describe how discriminative training can be applied to language models for speech recognition. Language models are important to guide the speech recognition search, particularly in compensating for mistakes in acoustic decoding. A frequently used measure of the quality of language models is the perplexity; however, what is more important for accurate decoding is not necessarily having the maximum likelihood hypothesis, but rather the best separation of the correct string from the competing, acoustically confusible hypotheses. Discriminative training can help to improve language models for the purpose...

10.1109/icassp.2002.5743720 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2002-05-01

An overview on automatic speech attribute transcription (ASAT)

Chin‐Hui Lee Mark A. Clements Sorin Dusan Eric Fosler‐Lussier Keith Johnson and 2 more

Automatic Speech Attribute Transcription (ASAT), an ITR project sponsored under the NSF grant (IIS-04-27113), is a cross-institute effort involving Georgia Institute of Technology, The Ohio State University, University of California at Berkeley, and Rutgers University. This project approaches speech recognition from a more linguistic perspective: unlike traditional ASR systems, humans detect acoustic and auditory cues, weigh and combine them to form theories, and then process these cognitive hypotheses until...

10.21437/interspeech.2007-509 article EN Interspeech 2007 2007-08-27

Conditional Random Fields for Integrating Local Discriminative Classifiers

Jeremy Morris Eric Fosler‐Lussier

Conditional random fields (CRFs) are a statistical framework that has recently gained in popularity in both the automatic speech recognition (ASR) and natural language processing communities because of the different nature of assumptions that are made in predicting sequences of labels compared to the more traditional hidden Markov model (HMM). In the ASR community, CRFs have been employed in a method similar to that of HMMs, using the sufficient statistics of input data to compute the probability of label sequences given acoustic input. In this paper, we explore...

10.1109/tasl.2008.916057 article EN IEEE Transactions on Audio Speech and Language Processing 2008-01-01

Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition

Deblin Bagchi Michael Mandel Zhong-Qiu Wang Yanzhang He Andrew Plummer and 1 more

Automatic Speech Recognition systems suffer from severe performance degradation in the presence of myriad complicating factors such as noise, reverberation, multiple speech sources, multiple recording devices, etc. Previous challenges have sparked much innovation when it comes to designing systems capable of handling these complications. In this spirit, the CHiME-3 challenge presents system builders with the task of recognizing speech in a real-world noisy setting wherein speakers talk to an array of 6 microphones in a tablet. In order to address...

10.1109/asru.2015.7404836 article EN 2015-12-01

A Direct Masking Approach to Robust ASR

William M. Hartmann Arun Narayanan Eric Fosler‐Lussier DeLiang Wang

Recently, much work has been devoted to the computation of binary masks for speech segregation. Conventional wisdom in the field of ASR holds that these binary masks cannot be used directly; the missing energy significantly affects the calculation of the cepstral features commonly used in ASR. We show that this commonly held belief may be a misconception; we demonstrate the effectiveness of directly using the masked data on both a small and large vocabulary dataset. In fact, this approach, which we term the direct masking approach, performs comparably to two previously proposed missing feature...

10.1109/tasl.2013.2263802 article EN IEEE Transactions on Audio Speech and Language Processing 2013-05-17

Jointly Embedding Entities and Text with Distant Supervision

Denis Newman-Griffis Albert M. Lai Eric Fosler‐Lussier

Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph...

10.18653/v1/w18-3026 article EN cc-by 2018-01-01

Comparison of UMLS terminologies to identify risk of heart disease using clinical notes

Chaitanya Shivade Pranav Malewadkar Eric Fosler‐Lussier Albert M. Lai

The second track of the 2014 i2b2 challenge asked participants to automatically identify risk factors for heart disease among diabetic patients using natural language processing techniques for clinical notes. This paper describes a rule-based system developed using a combination of regular expressions, concepts from the Unified Medical Language System (UMLS), and freely-available resources from the community. With a performance (F1=90.7) that is significantly higher than the median (F1=87.20) and close to the top performing system (F1=92.8),...

10.1016/j.jbi.2015.08.025 article EN cc-by-nc-nd Journal of Biomedical Informatics 2015-09-13
