- Emotion and Mood Recognition
- Music and Audio Processing
- Speech and Audio Processing
- COVID-19 diagnosis using AI
- Sentiment Analysis and Opinion Mining
- Speech Recognition and Synthesis
- Time Series Analysis and Forecasting
- Phonocardiography and Auscultation Techniques
- Non-Invasive Vital Sign Monitoring
- Anomaly Detection Techniques and Applications
- Speech and dialogue systems
- Obstructive Sleep Apnea Research
- Advanced Text Analysis Techniques
- Machine Learning in Healthcare
- Mental Health via Writing
- Color perception and design
- Heart Rate Variability and Autonomic Control
- ECG Monitoring and Analysis
- Natural Language Processing Techniques
- Context-Aware Activity Recognition Systems
- Data Quality and Management
- Psychological Well-being and Life Satisfaction
- Neural Networks and Applications
- Data-Driven Disease Surveillance
- Artificial Intelligence in Healthcare
The University of Melbourne
2024-2025
University of Cambridge
2021-2024
Nokia (United Kingdom)
2023-2024
UNSW Sydney
2015-2024
Sichuan Normal University
2022
Shihezi University
2019
Commonwealth Scientific and Industrial Research Organisation
2018
Data61
2018
To identify Coronavirus disease (COVID-19) cases efficiently, affordably, and at scale, recent work has shown how audio-based approaches (including cough, breathing, and voice) can be used for testing. However, there is a lack of exploration of how biases and methodological decisions impact these tools' performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory dataset through a mobile app,...
Continuous emotion dimension prediction has increased in popularity over the last few years, as the shift away from discrete classification-based tasks has introduced more realism in emotion modeling. However, many questions remain, including how best to combine information from several modalities (e.g. audio, video, etc.). As part of the AV+EC 2015 Challenge, we investigate annotation delay compensation and propose a range of multimodal systems based on an output-associative fusion framework. The performance of the proposed systems are...
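Annotation delay compensation, as mentioned in the abstract above, is commonly handled by shifting the label track back in time so that each feature frame is paired with the rating a human annotator produced slightly later. The following is a minimal sketch of that general idea, not the authors' implementation; the function name `compensate_annotation_delay` and the `delay_frames` parameter are hypothetical, and the delay value itself would normally be tuned on development data.

```python
import numpy as np


def compensate_annotation_delay(features, labels, delay_frames):
    """Pair the feature at frame t with the label at frame t + delay_frames.

    Hypothetical helper: assumes features and labels are sampled at the same
    frame rate and that annotators react `delay_frames` frames late.
    """
    features = np.asarray(features)
    labels = np.asarray(labels)
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same number of frames")
    if delay_frames < 0 or delay_frames >= len(labels):
        raise ValueError("delay_frames must be in [0, number of frames)")
    if delay_frames == 0:
        return features, labels
    # Drop the last frames of the features and the first frames of the labels
    # so the two tracks stay the same length after the shift.
    return features[:-delay_frames], labels[delay_frames:]


if __name__ == "__main__":
    feats = np.arange(10)          # toy one-dimensional feature track
    labs = np.arange(10) + 100     # toy label track
    f, l = compensate_annotation_delay(feats, labs, delay_frames=3)
    print(f.tolist())
    print(l.tolist())
```

Trimming both tracks, rather than padding, avoids inventing labels for frames that were never annotated, at the cost of losing `delay_frames` frames per recording.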
With the soaring adoption of in-ear wearables, the research community has started investigating suitable heart rate (HR) detection systems. HR is a key physiological marker of cardiovascular health and physical fitness. Continuous and reliable HR monitoring with wearable devices has therefore gained increasing attention in recent years. Existing HR detection systems in wearables mainly rely on photoplethysmography (PPG) sensors; however, these are notorious for poor performance in the presence of human motion. In this work, leveraging...
Speech Emotion Recognition (SER) is crucial in human-machine interactions. Mainstream approaches utilize Convolutional Neural Networks or Recurrent Neural Networks to learn local energy feature representations of speech segments from speech information, but struggle with capturing global information such as the duration of speech. Some use Transformers to capture global information, but there is room for improvement in terms of parameter count and performance. Furthermore, existing attention mechanisms focus on spatial or channel dimensions, hindering...
With the soaring adoption of in-ear wearables, the research community has started investigating suitable heart rate detection systems. Heart rate is a key physiological marker of cardiovascular health and physical fitness. Continuous and reliable heart rate monitoring with wearable devices has therefore gained increasing attention in recent years. Existing heart rate detection systems in wearables mainly rely on photoplethysmography (PPG) sensors; however, these are notorious for poor performance in the presence of human motion. In this work, leveraging...
Machine Learning models typically assume that time series are regularly spaced; however, this is often unrealistic in healthcare, where missing data recordings are common. In this context, uncertainty estimates play a pivotal role, as they can enable confident and non-confident predictions to be distinguished. We propose SQUIREDL, a novel uncertainty-aware sequence-to-sequence prediction method for sparse healthcare time series. Specifically, we enhance the state-of-the-art evidential regression framework,...
Recent work has shown the potential of using audio data (e.g., cough, breathing, and voice) in screening for COVID-19. However, these approaches only focus on one-off detection and detect the infection, given the current audio sample, but do not monitor disease progression. Limited exploration has been put forward to continuously monitor COVID-19 progression, especially recovery, through longitudinal audio data. Tracking disease progression characteristics and patterns of recovery could bring insights and lead to more timely treatment or treatment adjustment, as well...
Predicting emotion intensity and severity of depression are both challenging and important problems within the broader field of affective computing. As part of AVEC 2017, we developed a number of systems to accomplish these tasks. In particular, word affect features, which derive human affect ratings (e.g. arousal and valence) from transcripts, were investigated for predicting depression severity and liking, showing great promise. A simple system based on these features achieved an RMSE of 6.02 on the test set, yielding a relative improvement of 13.6% over...
Integrating physiological signals such as electroencephalogram (EEG) with other data, such as interview audio, may offer valuable multimodal insights into psychological states or neurological disorders. Recent advancements in Large Language Models (LLMs) position them as prospective "health agents" for mental health assessment. However, current research predominantly focuses on single data modalities, presenting an opportunity to advance understanding through multimodal data. Our study aims to advance this approach by investigating...
Within the field of affective computing, human emotion and disorder/disease recognition have progressively attracted more interest in multimodal analysis. This submission to the Depression Classification and Continuous Emotion Prediction challenges for AVEC2016 investigates both, with a focus on audio subsystems. For depression classification, we investigate token word selection, vocal tract coordination parameters computed from spectral centroid features, and gender-dependent classification systems....
Recently, sound-based COVID-19 detection studies have shown great promise to achieve scalable and prompt digital prescreening. However, there are still two unsolved issues hindering the practice. First, collected datasets for model training are often imbalanced, with a considerably smaller proportion of users tested positive, making it harder to learn representative and robust features. Second, deep learning models are generally overconfident in their predictions. Clinically, false predictions aggravate...
In the continuous quest to push the boundaries of mobile healthcare and fitness tracking, monitoring respiratory biomarkers emerges as a pivotal frontier. In this paper, we present OptiBreathe, a lightweight on-device earable system designed to decode respiratory modulations within photoplethysmography (PPG) signals. OptiBreathe computes three clinical respiratory biomarkers towards enabling respiratory health monitoring with wearable devices. In our effort to bridge research computing, we collected a first-of-its-kind dataset that empowers researchers to explore in-ear PPG...
Healthcare monitoring is crucial for early detection, timely intervention, and the ongoing management of health conditions, ultimately improving individuals' quality of life. Recent research shows that Large Language Models (LLMs) have demonstrated impressive performance in supporting healthcare tasks. However, existing LLM-based healthcare solutions typically rely on cloud-based systems, which raise privacy concerns and increase the risk of personal information leakage. As a result, there is growing interest in running...
The widespread adoption of wireless earbuds has advanced the developments in earable-based sensing in various domains like entertainment, human-computer interaction, and health monitoring. Recently, researchers have shown an increased interest in user authentication using earables. Despite the successes witnessed in acoustic probing and speech-based systems, this paper proposes a lightweight and non-invasive ambient sound based authentication scheme. It employs the difference between in-ear and out-ear sounds to estimate individual-specific...
Uncertainty quantification is critical for ensuring the safety of deep learning-enabled health diagnostics, as it helps the model account for unknown factors and reduces the risk of misdiagnosis. However, existing uncertainty quantification studies often overlook the significant issue of class imbalance, which is common in medical data. In this paper, we propose a class-balanced evidential learning framework to achieve fair and reliable uncertainty estimates for health diagnostic models. This framework advances the state-of-the-art uncertainty quantification method with two novel mechanisms...
Predicting continuous emotion in terms of affective attributes has mainly been focused on hard labels, which ignored the ambiguity of recognizing certain emotions. This ambiguity may result in high inter-rater variability and in turn causes varying prediction uncertainty with time. Based on the assumption that temporal dependencies occur in the evolution of uncertainty, this paper proposes a dynamic multi-rater Gaussian Mixture Regression (GMR), aiming to obtain the uncertainty reflected by multi-raters by taking into account their temporal dependencies....
Time series forecasting, as one of the fundamental machine learning areas, has attracted tremendous attention over recent years. The solutions have evolved from statistical machine learning (ML) methods to deep learning techniques. One emerging sub-field of time series forecasting is individual disease progression forecasting, e.g., predicting individuals' disease development over a few days (e.g., deteriorating trends, recovery speed) based on past observations. Despite the promises in existing ML techniques, a variety of unique challenges emerge for such...
Earables (in-ear wearables) are gaining increasing attention for sensing applications and healthcare research thanks to their ergonomy and non-invasive nature. However, air leakages between the device and the user's ear, resulting from daily activities or wearing variabilities, can decrease the performance of applications, interfere with calibrations, and reduce the robustness of the overall system. Existing literature lacks established methods for estimating the degree of air leaks (i.e., seal integrity) to provide information for the earable...
Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences. However, CTC is only applied to deterministic sequence models, where the latent space is discontinuous and sparse, which in turn makes them less capable of handling data variability when compared to variational models. In this paper, we integrate CTC with a variational model and derive loss functions that can be used to train more generalizable sequence models that preserve order....
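To illustrate how CTC preserves order between input frames and target labels, the sketch below applies the standard CTC collapsing rule to a frame-level path: consecutive repeats are merged, then blank symbols are removed. This shows only the generic many-to-one mapping that CTC relies on, not the variational loss described in the abstract; the function name `ctc_collapse` and the choice of `0` as the blank index are assumptions made for the example.

```python
def ctc_collapse(path, blank=0):
    """Map a frame-level CTC path to its label sequence.

    Standard CTC rule: merge consecutive repeated symbols, then drop blanks.
    `blank=0` is an assumed convention for this illustration.
    """
    labels = []
    previous = None
    for symbol in path:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank; a blank between repeats keeps both copies.
        if symbol != previous and symbol != blank:
            labels.append(symbol)
        previous = symbol
    return labels


if __name__ == "__main__":
    # Frames: blank, 1, 1, blank, 1, 2, 2, blank
    print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
```

Because the mapping only merges and deletes symbols, the left-to-right order of the surviving labels always matches the order of the frames that produced them.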