Su-Lin Wu

ORCID: 0009-0009-0425-1177
Research Areas
  • Speech and Audio Processing
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Recommender Systems and Techniques
  • Topic Modeling
  • Advanced Bandit Algorithms Research
  • Expert finding and Q&A systems
  • Image and Video Quality Assessment
  • Speech and dialogue systems
  • Algorithms and Data Compression
  • Advanced Data Compression Techniques
  • Information Retrieval and Search Behavior
  • Consumer Market Behavior and Pricing
  • Image and Signal Denoising Methods
  • Advanced Graph Neural Networks
  • Smart Grid Energy Management
  • Natural Language Processing Techniques
  • Neural Networks and Applications
  • Phonetics and Phonology Research

Google (United States)
2021-2024

Yahoo (United States)
2011-2012

Nuance Communications (Austria)
2003-2004

International Computer Science Institute
1999-2002

University of California, Berkeley
2002

Effective exploration is believed to positively influence the long-term user experience on recommendation platforms. Determining its exact benefits, however, has been challenging. Regular A/B tests often measure neutral or even negative engagement metrics while failing to capture its benefits. We here introduce new experiment designs to formally quantify the value of exploration by examining its effects on the content corpus, and connecting corpus growth to the long-term user experience from real-world experiments. Once the value of exploration is established, we...

10.1145/3616855.3635833 article EN other-oa 2024-03-04

Including information distributed over intervals of syllabic duration (100-250 ms) may greatly improve the performance of automatic speech recognition (ASR) systems. ASR systems primarily use representations and units covering phonetic durations (40-100 ms). Humans certainly use information at phonetic time scales, but results from psychoacoustics and psycholinguistics highlight the crucial role of the syllable, and of syllable-length intervals, in perception. We compare three systems: a baseline system that uses phone-scale units, an...

10.1109/icassp.1998.675366 article EN 2002-11-27

Reinforcement Learning (RL) has been sought after to bring next-generation recommender systems to further improve user experience on recommendation platforms. While the exploration-exploitation tradeoff is the foundation of RL research, the value of exploration in (RL-based) recommender systems is less well understood. Exploration, commonly seen as a tool to reduce model uncertainty in regions of sparse interaction/feedback, is believed to come at a cost in the short term, while the indirect benefit of better quality arrives at a later time. We focus on another aspect...

10.1145/3460231.3474236 article EN 2021-09-13

We examine the proposition that knowledge of the timing of syllabic onsets may be useful in improving the performance of speech recognition systems. A method of estimating the location of syllable onsets, derived from the analysis of energy trajectories in critical band channels, has been developed, and a syllable-based decoder has been designed and implemented that incorporates this onset information into the recognition process. For a small, continuous speech task, the addition of artificial onset information (derived in advance from word transcriptions) lowers the error rate by 38%. Incorporating...

10.1109/icassp.1997.596105 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2002-11-22

Combining knowledge derived from both syllable-length (100-250 ms) and phone-length (40-100 ms) intervals in the automatic speech recognition process can yield performance superior to that obtained using information from a single time scale alone. The results are particularly pronounced for reverberant test conditions that have not been incorporated into the training set. In the present study, phone- and syllable-based systems are combined at three distinct levels of the recognition process: the frame, the syllable, and the entire utterance. Each strategy...

10.21437/icslp.1998-305 article EN 5th International Conference on Spoken Language Processing (ICSLP 1998) 1998-11-30

We have developed a statistical model of speech (based on auditory perceptual criteria) that avoids a number of current constraining assumptions for recognition systems, particularly the modeling of speech as a sequence of stationary segments consisting of uncorrelated acoustic vectors. We further wish to focus modeling power on the perceptually-dominant and information-rich portions of the signal, which may also be the parts of the signal with a better chance to withstand adverse acoustical conditions. We describe some of the theory, along with preliminary experiments....

10.1109/icassp.1995.479605 article EN International Conference on Acoustics, Speech, and Signal Processing 2002-11-19

In (Morgan et al., 1994), we developed a statistical model of speech recognition where emphasis was placed on the perceptually-relevant and information-rich portion of the signal. In that model, speech is viewed as a sequence of elementary decisions or auditory events (avents) that are made in response to loci of significant spectral change. These decision points are interleaved with periods during which insufficient information has been accumulated to make the next decision. We have called this the stochastic perceptual avent model (SPAM)....

10.1109/icslp.1996.607851 article EN 2002-12-24

We present new acoustic confidence scores for utterance verification based on novel combinations of phone-level posterior probability statistics. A common score used in the literature is the arithmetic mean (computed over the utterance) of the phone log posterior probabilities. This approach can be problematic when a large part of the utterance is in-grammar (IG), but a small part is out-of-grammar (OOG). For example, a caller says an OOG name "Larry" and it is incorrectly recognized as the IG name "Harry". Since most of the phones were correctly recognized, the posteriors...

10.1109/icassp.2003.1198848 article EN 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003-11-21

A recommendation system serves as a conduit connecting users to an incredibly large, diverse, and ever-growing collection of contents. In practice, missing information on fresh (and tail) contents needs to be filled in order for them to be exposed to and discovered by their audience. We here share our success stories in building a dedicated content recommendation stack on a large commercial platform. To nominate contents, we built a multi-funnel nomination system that combines (i) a two-tower model with strong generalization power...

10.1145/3580305.3599826 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Classical models of speech recognition (both human and machine) assume that a detailed, short-term analysis of the signal is essential for accurate decoding of spoken language via a linear sequence of phonetic segments. This classical framework is incommensurate with quantitative acoustic/phonetic analyses of spontaneous discourse (e.g., the Switchboard corpus of American English). Such analyses indicate that the syllable, rather than the phone, is likely to serve as the representational interface between sound and meaning, providing a relatively...

10.1121/1.425503 article EN The Journal of the Acoustical Society of America 1999-02-01

10.48550/arxiv.2306.01720 preprint EN cc-by arXiv (Cornell University) 2023-01-01

10.21437/icslp.1996-333 article EN 4th International Conference on Spoken Language Processing (ICSLP 1996) 1996-10-03

This paper describes a data-driven technique for optimizing the acoustic models of speech recognition systems that target commercial applications over telephones. Frame-averaged foreground log-likelihoods (foreground scores) correlate with errors. These scores are used together with gender to optimize the data weighting of the model. This process is interpreted as increasing the priors and associated parameters of poorly modeled data. The score-based optimization leads to about 7% fewer semantic errors on a live evaluation set...

10.1109/icassp.2004.1326111 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2004-09-28