Shaomei Wu

ORCID: 0000-0003-1104-4116
Research Areas
  • Stuttering Research and Treatment
  • Phonetics and Phonology Research
  • Neurobiology of Language and Bilingualism
  • Assistive Technology in Communication and Mobility
  • Speech Recognition and Synthesis
  • Social Media and Politics
  • Multilingual Education and Policy
  • Media Influence and Politics
  • Language, Discourse, Communication Strategies
  • Employee Welfare and Language Studies

Existing videoconferencing (VC) technologies are often optimized for productivity and efficiency, with little support for the "soft side" of VC meetings such as empathy, authenticity, belonging, and emotional connections. This paper presents findings from a 15-month long autoethnographic study of VC experiences by the first author, a person who stutters (PWS). Our research shed light on the hidden costs of VC for PWS, uncovering the substantial emotional and cognitive efforts that other meeting attendants are often unaware of. Recognizing...

10.1145/3613904.3642746 article EN cc-by 2024-05-11

The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the largest dataset in its category. Encompassing conversational and voice command reading speech, AS-70 includes verbatim manual transcription,...

10.21437/interspeech.2024-918 article EN Interspeech 2024 2024-09-01

Despite the widespread adoption of Automatic Speech Recognition (ASR) models in voice-operated products and conversational AI agents, current ASR models perform poorly for people who stutter. One primary cause of the performance disparity is the lack of representative stuttered speech data during the development of ASR models. This work introduces the first stuttered speech dataset for Mandarin Chinese, created by a grassroots community of Chinese-speaking people who stutter to facilitate inclusive and fair AI. Collected from 72 speakers with a wide range of stuttering...

10.1145/3613905.3650950 article EN public-domain 2024-05-02

This work studies the experiences of people who stutter (PWS) with videoconferencing (VC) and VC technologies. Our interview study with 13 adults who stutter uncovers extra challenges introduced by current VC platforms to people who stutter. While some of the challenges are a direct result of the characteristics of stuttering (e.g., people/systems mistaking pauses as the end of a turn), the bigger yet less visible challenge comes with the significant amount of emotional and cognitive effort required to manage one's speech and identity over VC, in which people's existing communication...

10.1145/3544548.3580788 article EN 2023-04-19

Literacy is one of the most fundamental skills for people to access and navigate today’s digital environment. This work systematically studies the language literacy skills of online populations for more than 160 countries and regions across the world, including many low-resourced countries where official literacy data are particularly sparse. Leveraging public data on Facebook, we develop a population-level literacy estimate for the online population that is based on aggregated and de-identified public posts written by adult Facebook users globally, significantly...

10.1140/epjds/s13688-023-00388-4 article EN cc-by EPJ Data Science 2023-05-08

The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the largest dataset in its category. Encompassing conversational and voice command reading speech, AS-70 includes verbatim manual transcription,...

10.48550/arxiv.2406.07256 preprint EN arXiv (Cornell University) 2024-06-11

This paper documents the process undertaken by StammerTalk, a grassroots community of Chinese-speaking people who stutter, to autonomously collect and curate stuttered speech data for more inclusive AI models. While people with disabilities are often excluded or treated merely as subjects of data collection, our work introduces a new model of disability data collection in which the community exerts agency and control over their personal data and data-driven experiences. Our ethnographic data show that community-led data collection not only produces needed...

10.1145/3687014 article EN other-oa Proceedings of the ACM on Human-Computer Interaction 2024-11-07