Eric Fosler‐Lussier

ORCID: 0000-0001-8004-5169
Research Areas
  • Natural Language Processing Techniques
  • Speech Recognition and Synthesis
  • Topic Modeling
  • Speech and dialogue systems
  • Speech and Audio Processing
  • Biomedical Text Mining and Ontologies
  • Music and Audio Processing
  • Phonetics and Phonology Research
  • Multi-Agent Systems and Negotiation
  • Advanced Data Compression Techniques
  • Text and Document Classification Technologies
  • Text Readability and Simplification
  • Machine Learning in Healthcare
  • Blind Source Separation Techniques
  • Multimodal Machine Learning Applications
  • Misinformation and Its Impacts
  • Semantic Web and Ontologies
  • Intelligent Tutoring Systems and Adaptive Learning
  • Advanced Text Analysis Techniques
  • Geographic Information Systems Studies
  • Infant Health and Development
  • Software Engineering Research
  • Language Development and Disorders
  • Expert finding and Q&A systems
  • Context-Aware Activity Recognition Systems

The Ohio State University
2016-2025

Nationwide Children's Hospital
2018-2020

Amazon (United States)
2018-2019

University of Science and Technology of China
2019

University of Udine
2018-2019

University of Cambridge
2018-2019

Middle East Technical University
2018-2019

University of Illinois Urbana-Champaign
2018-2019

Delft University of Technology
2018-2019

The University of Tokyo
2018-2019

Function words, especially frequently occurring ones such as (the, that, and, and of), vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation (e.g., ði, ðæt, ænd, ʌv) or a more reduced or lenited pronunciation (e.g., ðə, ðɨt, n, ə). It is based on over 8000 occurrences of the ten most frequent English function words in a 4-h sample...

10.1121/1.1534836 article EN The Journal of the Acoustical Society of America 2003-01-28

We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content, using a text-based algorithm as a feature, and knowledge about form, using linguistic and acoustic cues about topic shifts extracted from speech. This segmentation algorithm uses automatically induced decision rules to combine the different features. The embedded text-based algorithm builds on lexical cohesion and has performance comparable to state-of-the-art algorithms based on lexical information. A significant error reduction is obtained by combining the two knowledge sources.

10.3115/1075096.1075167 article EN 2003-01-01

Training a POS tagging model with cross-lingual transfer learning usually requires linguistic knowledge and resources about the relation between the source language and the target language. In this paper, we introduce a cross-lingual transfer learning model for POS tagging without ancillary resources such as parallel corpora. The proposed cross-lingual model utilizes a common BLSTM that enables knowledge transfer from other languages, and private BLSTMs for language-specific representations. The cross-lingual model is trained with language-adversarial training and bidirectional language modeling as auxiliary objectives to better represent...

10.18653/v1/d17-1302 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

We report progress in the development of a measure of speaking rate that is computed from the acoustic signal. The newest form of our analysis incorporates multiple estimates of rate; besides the spectral moment for a full-band energy envelope that we have previously reported, we also used pointwise correlation between pairs of compressed sub-band energy envelopes. The complete measure, called mrate, has been compared to a reference syllable rate derived from a manually transcribed subset of the Switchboard database. with significantly higher than...

10.1109/icassp.1998.675368 article EN 2002-11-27

The causes of pronunciation reduction in 8458 occurrences of ten frequent English function words in a four-hour sample from conversations from the Switchboard corpus were examined. Using ordinary linear and logistic regression models, we examined the length of the words, the form of their vowel (basic, full, or reduced), and final obstruent deletion. For all of these we found strong, independent effects of speaking rate, predictability, the form of the following word, and planning problem disfluencies. The results bear on issues in speech recognition, models...

10.21437/icslp.1998-801 article EN 5th International Conference on Spoken Language Processing (ICSLP 1998) 1998-11-30

Automatic speech recognition (ASR) systems suffer from performance degradation under noisy and reverberant conditions. In this work, we explore a deep neural network (DNN) based approach for spectral feature mapping from corrupted speech to clean speech. The DNN substantially reduces interference and produces estimated clean features for ASR training and decoding. We experiment with several different approaches and demonstrate that a DNN trained to predict log filterbank coefficients from the spectrogram directly can be extremely effective....

10.21437/interspeech.2015-536 article EN Interspeech 2015 2015-09-06

In this paper, we take a step towards jointly modeling automatic speech recognition (STT) and speech synthesis (TTS) in a fully non-autoregressive way. We develop a novel multimodal framework capable of handling the speech and text modalities as input either individually or together. The proposed model can also be trained with unpaired speech or text data owing to its multimodal nature. We further propose an iterative refinement strategy to improve the STT and TTS performance of our model such that the partial hypothesis at the output can be fed back to the input of our model, thus iteratively...

10.48550/arxiv.2501.09104 preprint EN arXiv (Cornell University) 2025-01-15

10.1109/icassp49660.2025.10887605 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

We propose Inference Knowledge Graph, a novel approach of remapping existing, large-scale semantic knowledge graphs into Markov Random Fields in order to create user goal tracking models that could form part of a spoken dialog system. Since semantic knowledge graphs include both entities and their attributes, the proposed method merges the dialog-state-tracking of attributes and the database lookup of entities that fulfill users' requests into one single unified step. Using a large semantic graph that contains all businesses in Bellevue, WA, extracted from Microsoft Satori, we...

10.1109/icassp.2015.7178992 article EN 2015-04-01

Introduction: Practicing a medical history using standardized patients is an essential component of medical school curricula. Recent advances in technology now allow for newer approaches for practicing and assessing communication skills. We describe herein a virtual standardized patient (VSP) system that allows medical students to practice their history taking skills and receive immediate feedback. Methods: Our VSPs consist of artificially intelligent, emotionally responsive 3D characters which communicate with students using natural language. The...

10.1080/0142159x.2019.1616683 article EN Medical Teacher 2019-06-22

Clinical trials are essential for determining whether new interventions are effective. In order to determine the eligibility of patients to enroll into these trials, clinical trial coordinators often perform a manual review of clinical notes in the electronic health record of patients. This is a very time-consuming and exhausting task. Efforts in this process can be expedited if these coordinators are directed toward specific parts of the text that are relevant for eligibility determination. In this study, we describe the creation of a dataset that can be used to evaluate automated methods capable...

10.1016/j.jbi.2015.09.008 article EN cc-by-nc-nd Journal of Biomedical Informatics 2015-09-14

In this paper, we propose to improve end-to-end (E2E) spoken language understanding (SLU) in an RNN transducer model (RNN-T) by incorporating a joint self-conditioned CTC automatic speech recognition (ASR) objective. Our proposed model is akin to an E2E differentiable cascaded model which performs ASR and SLU sequentially, and we ensure that the SLU task is conditioned on the ASR task by having CTC self conditioning. This novel joint modeling of ASR and SLU improves SLU performance significantly over just using SLU optimization. We further improve the performance by aligning the acoustic embeddings...

10.48550/arxiv.2501.01936 preprint EN arXiv (Cornell University) 2025-01-03

In this paper we describe how discriminative training can be applied to language models for speech recognition. Language models are important to guide the speech recognition search, particularly in compensating for mistakes in acoustic decoding. A frequently used measure of the quality of language models is the perplexity; however, what is more important for accurate decoding is not necessarily having the maximum likelihood hypothesis, but rather the best separation of the correct string from the competing, acoustically confusible hypotheses. Discriminative training can help to improve language models for the purpose...

10.1109/icassp.2002.5743720 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2002-05-01

Automatic Speech Attribute Transcription (ASAT), an ITR project sponsored under the NSF grant (IIS-04-27113), is a cross-institute effort involving Georgia Institute of Technology, The Ohio State University, University of California at Berkeley, and Rutgers University. This project approaches speech recognition from a more linguistic perspective: unlike traditional ASR systems, humans detect acoustic and auditory cues, weigh and combine them to form theories, and then process these cognitive hypotheses until...

10.21437/interspeech.2007-509 article EN Interspeech 2007 2007-08-27

Conditional random fields (CRFs) are a statistical framework that has recently gained in popularity in both the automatic speech recognition (ASR) and natural language processing communities because of the different nature of the assumptions that are made in predicting sequences of labels compared to the more traditional hidden Markov model (HMM). In the ASR community, CRFs have been employed in a method similar to that of HMMs, using the sufficient statistics of input data to compute the probability of label sequences given acoustic input. In this paper, we explore...

10.1109/tasl.2008.916057 article EN IEEE Transactions on Audio Speech and Language Processing 2008-01-01

Automatic Speech Recognition systems suffer from severe performance degradation in the presence of myriad complicating factors such as noise, reverberation, multiple speech sources, multiple recording devices, etc. Previous challenges have sparked much innovation when it comes to designing systems capable of handling these complications. In this spirit, the CHiME-3 challenge presents system builders with the task of recognizing speech in a real-world noisy setting wherein speakers talk to an array of 6 microphones in a tablet. In order to address...

10.1109/asru.2015.7404836 article EN 2015-12-01

Recently, much work has been devoted to the computation of binary masks for speech segregation. Conventional wisdom in the field of ASR holds that these binary masks cannot be used directly; the missing energy significantly affects the calculation of the cepstral features commonly used in ASR. We show that this commonly held belief may be a misconception; we demonstrate the effectiveness of directly using the masked data on both a small and a large vocabulary dataset. In fact, this approach, which we term the direct masking approach, performs comparably to two previously proposed feature...

10.1109/tasl.2013.2263802 article EN IEEE Transactions on Audio Speech and Language Processing 2013-05-17

Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph...

10.18653/v1/w18-3026 article EN cc-by 2018-01-01

The second track of the 2014 i2b2 challenge asked participants to automatically identify risk factors for heart disease among diabetic patients using natural language processing techniques for clinical notes. This paper describes a rule-based system developed using a combination of regular expressions, concepts from the Unified Medical Language System (UMLS), and freely-available resources from the community. With a performance (F1=90.7) that is significantly higher than the median (F1=87.20) and close to the top performing system (F1=92.8),...

10.1016/j.jbi.2015.08.025 article EN cc-by-nc-nd Journal of Biomedical Informatics 2015-09-13