- Autism Spectrum Disorder Research
- Speech Recognition and Synthesis
- Speech and Audio Processing
- Language Development and Disorders
- Emotion and Mood Recognition
- Phonetics and Phonology Research
- Child Development and Digital Technology
- Voice and Speech Disorders
- Music and Audio Processing
- Behavioral and Psychological Studies
- Child and Adolescent Psychosocial and Emotional Development
- Stuttering Research and Treatment
- Infant Health and Development
- Attachment and Relationship Dynamics
- Genetics and Neurodevelopmental Disorders
- Speech and dialogue systems
- Functional Brain Connectivity Studies
- Virology and Viral Diseases
- Mental Health Research Topics
- RNA regulation and disease
- Williams Syndrome Research
- Evolutionary Psychology and Human Behavior
- Multisensory perception and integration
- Animal Vocal Communication and Behavior
- Child and Animal Learning Development
Amazon (United States)
2021
University of Southern California
2010-2020
Southern California University for Professional Studies
2010-2017
Google (United States)
2014
Background: Machine learning (ML) provides novel opportunities for human behavior research and clinical translation, yet its application can have noted pitfalls (Bone et al., 2015). In this work, we fastidiously utilize ML to derive autism spectrum disorder (ASD) instrument algorithms in an attempt to improve upon widely used ASD screening and diagnostic tools. Methods: The data consisted of Autism Diagnostic Interview‐Revised (ADI‐R) and Social Responsiveness Scale (SRS) scores for 1,264 verbal individuals...
The purpose of this study was to examine relationships between prosodic speech cues and autism spectrum disorder (ASD) severity, hypothesizing a mutually interactive relationship between the speech characteristics of the psychologist and the child. The authors objectively quantified acoustic-prosodic cues of the psychologist and of the child with ASD during spontaneous interaction, establishing a methodology for future large-sample analysis. Speech features were semiautomatically derived from segments of semistructured interviews (Autism Diagnostic Observation...
Formally, the problem that we present is that of identifying the hidden attributes of a system that modulates the body's signals, uncovered through novel signal processing and machine learning on large-scale multimodal data (Figure 1). Signal processing is the keystone that supports this mapping from data to representations of behaviors and mental states. The pipeline first begins with raw signals, such as those from visual, auditory, and physiological sensors. Then, we need to localize information coming from the corresponding behavioral channels, such as the face, body, and voice. Next, the signals are...
Individuals with serious mental illness experience changes in their clinical states over time that are difficult to assess and result in increased disease burden and care utilization. It is not known if features derived from speech can serve as a transdiagnostic marker of these clinical states. This study evaluates the feasibility of collecting speech samples from people with serious mental illness and explores the potential utility of these samples for tracking changes in clinical state over time. Patients (n = 47) were recruited from a community-based mental health clinic with diagnoses of bipolar disorder, major...
Speech emotion recognition (SER) is a key technology to enable more natural human-machine communication. However, SER has long suffered from a lack of public large-scale labeled datasets. To circumvent this problem, we investigate how unsupervised representation learning on unlabeled datasets can benefit SER. We show that the contrastive predictive coding (CPC) method can learn salient representations from unlabeled datasets, which improves SER performance. In our experiments, this method achieved state-of-the-art concordance...
Studies in classifying affect from vocal cues have produced exceptional within-corpus results, especially for arousal (activation or stress); yet cross-corpora affect recognition has only recently garnered attention. An essential requirement of many behavioral studies is affect scoring that generalizes across different social contexts and data conditions. We present a robust, unsupervised (rule-based) method for providing a scale-continuous, bounded arousal rating operating on the vocal signal. The method incorporates just three...
Atypical prosody, often reported in children with Autism Spectrum Disorders, is described by a range of qualitative terms that reflect the eccentricities and variability among persons in the spectrum. We investigate various word- and phonetic-level features from spontaneous speech that may quantify the cues reflecting prosody. Furthermore, we introduce the importance of jointly modeling the psychologist’s vocal behavior in this dyadic interaction. We demonstrate that acoustic-prosodic features of both participants correlate with the children’s rated...
Empathy measures the capacity of the therapist to experience the same cognitive and emotional dispositions as the patient, and is a key quality factor in counseling. In this work we build computational models to infer empathy using prosodic cues. We extract pitch, energy, jitter, shimmer and utterance duration from the speech signal, and normalize and quantize these features in order to estimate the distribution of certain prosodic patterns during each interaction. We find significant correlation between empathy and the distribution of prosodic patterns, and achieve 75% accuracy in classifying...
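The normalize-then-quantize step described in the abstract above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the z-score normalization, the three-level bins with cut points at ±0.5, the function names, and the sample pitch values are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def z_normalize(values):
    """Center and scale one speaker's feature values (assumed normalization)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def quantize(z_values, edges=(-0.5, 0.5)):
    """Map normalized values to coarse levels; the cut points are hypothetical."""
    labels = []
    for z in z_values:
        if z < edges[0]:
            labels.append("low")
        elif z > edges[1]:
            labels.append("high")
        else:
            labels.append("mid")
    return labels


def pattern_distribution(labels):
    """Relative frequency of each quantized level within one interaction."""
    levels = ("low", "mid", "high")
    if not labels:
        return {level: 0.0 for level in levels}
    counts = Counter(labels)
    return {level: counts.get(level, 0) / len(labels) for level in levels}


# Hypothetical per-utterance median pitch values (Hz) for one speaker.
pitch_hz = [180.0, 185.0, 240.0, 150.0, 182.0, 178.0]
dist = pattern_distribution(quantize(z_normalize(pitch_hz)))
```

The resulting per-interaction distribution could then serve as a fixed-length feature vector for a correlation analysis or a classifier, whichever the study calls for.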
A method of rapid semi-automatic segmentation of real-time magnetic resonance image data for parametric analysis of vocal tract shaping is described. Tissue boundaries are identified by seeking pixel intensity thresholds along tract-normal gridlines. Airway contours are constrained with respect to a tract centerline defined as an optimal path over the graph of all intensity minima between the glottis and lips. The method allows for superimposition of reference boundaries to guide automatic segmentation of anatomical features which are poorly imaged using magnetic resonance ‐ dentition and the hard...
The study of speech pathology involves the evaluation and treatment of speech production related disorders affecting phonation, fluency, intonation and aeromechanical components of respiration. Recently, speech pathology has garnered special interest amongst machine learning and signal processing (ML-SP) scientists. This growth in interest is led by advances in novel data collection technology, data science, and computational modeling. These in turn have enabled scientists to better understand both the causes and effects of pathological speech conditions. In this paper,...
Speaker state recognition is a challenging problem due to speaker and context variability. Intoxication detection is an important area of paralinguistic speech research with potential real-world applications. In this work, we build upon a base set of various static acoustic features by proposing the combination of several different methods for this learning task. The methods include extracting hierarchical features, performing iterative normalization, and using GMM supervectors. We obtain an optimal unweighted recall...
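For readers unfamiliar with the metric named in the abstract above, unweighted recall is commonly computed as the mean of per-class recalls, so that each class counts equally regardless of its size. The sketch below assumes that common definition; the function name and the class labels are hypothetical and are not taken from the paper.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall, giving every class equal weight."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced labels: a classifier that always predicts "sober".
y_true = ["sober"] * 8 + ["intoxicated"] * 2
y_pred = ["sober"] * 10
uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case plain accuracy is 0.8 while the unweighted recall is 0.5, which is why the metric is often preferred when classes are imbalanced.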
Atypical speech prosody is a primary characteristic of autism spectrum disorders (ASD), yet it is often excluded from diagnostic instrument algorithms due to poor subjective reliability. Robust, objective prosodic cues can enhance our understanding of those aspects which are atypical in autism. In this work, we connect signal-derived descriptors of prosody to perceptions of awkwardness. Subjectively, more awkward speech is less expressive (more monotone) and has perceived awkward rate/rhythm, volume, and intonation. We also find...
Reliable, scalable and efficient diagnosis of Parkinson’s Disease (PD) is a major clinical need. Automating the diagnosis can lead to more accurate and objective predictions as well as provide insights regarding the nature of the condition. This paper proposes a fully automated system to rate the severity (UPDRS-III scale) of PD from patients’ speech. Specifically, the system captures atypicalities in an individual’s voice when performing multiple diverse speaking tasks and makes a unified prediction of severity. The performance is tested...
Impaired social communication and reciprocity are the primary phenotypic distinctions between autism spectrum disorders (ASD) and other developmental disorders. We investigate quantitative conversational cues in child-psychologist interactions using acoustic-prosodic, turn-taking, and language features. Results indicate that conversational quality degraded for children with higher ASD severity, as the child exhibited difficulties conversing and the psychologist varied her speech and language strategies to engage the child. When interacting...
We introduce the USC CARE Corpus, comprised of spontaneous and standardized child-psychologist interactions of children with a diagnosis of an autism spectrum disorder (ASD). The audio-video data is collected in the context of the Autism Diagnostic Observation Schedule (ADOS), which is a tool used by psychologists for research-level diagnosis of ASD in children. The interaction consists of developmentally appropriate semi-structured social activities, providing the psychologist with a sample of behavior to rate the child on a series of autism-relevant...
This paper presents an automatic speaker state recognition approach which models the factor vectors in the latent factor analysis framework, improving upon the Gaussian Mixture Model (GMM) baseline performance. We investigate both intoxicated and affective speaker states. We consider the affective speech signal as the original normal average speech signal being corrupted by affective channel effects. Rather than reducing the channel variability to enhance robustness as in the speaker verification task, we directly model the speaker state on the channel factors under the factor analysis framework. In this work, the channel factor vectors are extracted and modeled by the GMM...
Speech and spoken language cues offer a valuable means to measure and model human behavior. Computational models of speech behavior have the potential to support health care through assistive technologies, informed intervention, and efficient long-term monitoring. The Interspeech 2013 Autism Sub-Challenge addresses two developmental disorders that manifest in speech: autism spectrum disorders and specific language impairment. We present classification results with an analysis on the development set, including a discussion of confounds...
Automatically evaluating the pronunciation quality of non-native speech has seen tremendous success in both research and commercial settings, with applications in L2 learning. In this paper, submitted for the INTERSPEECH 2015 Degree of Nativeness Sub-Challenge, this problem is posed under a challenging cross-corpora setting using speech data drawn from multiple speakers from a variety of language backgrounds (L1) reading different English sentences. Since the perception of non-nativeness is realized at the segmental and suprasegmental...
This paper presents an unsupervised method for producing a bounded rating of affective arousal from speech. One of the major challenges in such behavioral signal classification is the design of methods that generalize well across domains and datasets. We propose a framework that provides robustness across databases by: selecting coherent features based on empirical and theoretical evidence, fusing activation confidences from multiple features, and effectively weighting the soft-labels without knowing the true labels. Spearman’s...
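One way to picture the fusion of per-feature activation confidences into a bounded rating is sketched below. It is an illustrative assumption, not the framework from the paper: each feature is scored by its rank within a set of baseline values and mapped to [-1, 1], and the scores are combined with an equal-weight average unless weights are supplied. The function names, baseline values, and features are all hypothetical.

```python
def feature_score(value, baseline):
    """Map a feature value to [-1, 1] by its rank among baseline values."""
    if not baseline:
        raise ValueError("baseline must contain at least one value")
    below = sum(1 for b in baseline if b < value)
    return 2.0 * below / len(baseline) - 1.0


def fuse_scores(scores, weights=None):
    """Weighted mean of per-feature scores; equal weights by default."""
    if not scores:
        raise ValueError("scores must be non-empty")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores) or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative and match scores")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    return sum(w * s for w, s in zip(weights, scores)) / total


# Hypothetical baseline (neutral speech) values and one test utterance.
pitch_baseline = [100.0, 110.0, 120.0, 130.0]
intensity_baseline = [60.0, 62.0, 64.0, 66.0]
scores = [
    feature_score(125.0, pitch_baseline),     # 3 of 4 baseline values below
    feature_score(67.0, intensity_baseline),  # all baseline values below
]
rating = fuse_scores(scores)
```

Because every per-feature score lies in [-1, 1] and the weights are non-negative, the fused rating stays in the same range without needing any labeled data.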
Signal-derived measures can provide effective ways towards quantifying human behavior. Verbal Response Latencies (VRLs) of children with Autism Spectrum Disorders (ASD) during conversational interactions are able to convey valuable information about their cognitive and social skills. Motivated by the inherent gap between the external behavior and inner affective state of children with ASD, we study their VRLs in relation to their explicit but also implicit behavioral cues. Explicit cues include the children's language use, while implicit cues are based...
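A verbal response latency can be thought of as the gap between the end of one speaker's turn and the start of the other speaker's reply. The sketch below computes it from time-stamped turns. The `(speaker, start, end)` tuple format, the speaker labels, and the function name are assumptions for illustration, and the paper's exact definition may differ.

```python
def response_latencies(turns, responder="child"):
    """Gaps (seconds) between another speaker's turn end and the responder's start.

    `turns` is a chronologically ordered list of (speaker, start, end) tuples.
    A negative latency indicates overlapping speech.
    """
    latencies = []
    for prev, cur in zip(turns, turns[1:]):
        prev_speaker, _, prev_end = prev
        cur_speaker, cur_start, _ = cur
        if cur_speaker == responder and prev_speaker != responder:
            latencies.append(cur_start - prev_end)
    return latencies


# Hypothetical turn segmentation of a short exchange.
turns = [
    ("psychologist", 0.0, 2.0),
    ("child", 2.8, 4.0),
    ("psychologist", 4.5, 6.0),
    ("child", 6.2, 7.0),
]
vrls = response_latencies(turns)
```

Only transitions from another speaker to the responder are counted, so consecutive turns by the same speaker do not contribute a latency.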