- Emotion and Mood Recognition
- Speech Recognition and Synthesis
- Speech and Audio Processing
- Topic Modeling
- Social Robot Interaction and HRI
- Sentiment Analysis and Opinion Mining
- Speech and Dialogue Systems
- Music and Audio Processing
- Natural Language Processing Techniques
- Voice and Speech Disorders
- AI in Service Interactions
- Mental Health via Writing
- Robotics and Automated Systems
- Diabetic Foot Ulcer Assessment and Management
- Brain Tumor Detection and Classification
- Functional Brain Connectivity Studies
- EEG and Brain-Computer Interfaces
- Advanced Computing and Algorithms
- Parkinson's Disease Mechanisms and Treatments
- Muscle Activation and Electromyography Studies
- Gaze Tracking and Assistive Technology
- Artificial Intelligence in Education
- Digital Mental Health Interventions
- Human-Animal Interaction Studies
- Multi-Agent Systems and Negotiation
- Northeastern University (2022-2025)
- Osaka University (2020-2025)
- Qinhuangdao Science and Technology Bureau (2025)
- Northeastern University (2024)
- Advanced Telecommunications Research Institute International (2020-2021)
Emotion recognition in conversation (ERC) is an important research direction in the field of human-computer interaction (HCI), which recognizes emotions by analyzing utterance signals to enhance user experience and plays a role in several domains. However, existing work on ERC mainly focuses on constructing graph networks by directly modeling interactions on multimodal fused features, which cannot adequately capture the complex dialog dependency based on time, speaker, modalities, etc. In addition, multi-task learning...
Depression can be reflected by long-term human spatio-temporal facial behaviours. While face videos recorded in the real world usually have long and variable lengths, existing video-based depression assessment approaches frequently re-sample/down-sample such videos to short equal-length videos, or split each video into several segments, where segment-level behaviours are suppressed as vector-style representations for RNN-based (video-level) modelling. Both strategies lead to crucial information loss...
Many social robots have emerged in public places to serve people. For these services, the robots are assumed to be able to present internal aspects (i.e., mind, sociability) to engage and interact with people over the long term. In this paper, we propose a novel dialogue structure called experience-based dialogue to help the robot maintain good interaction with people. This structure contains a piece of knowledge and a story about how the robot gained the knowledge, which are used to compose the robot’s experience-related utterances for sharing experiences when interacting...
Mental health issues are receiving more and more attention in society. In this paper, we introduce a preliminary study on human-robot mental comforting conversation, to make an android robot (ERICA) present an understanding of the user's situation by sharing similar emotional experiences to enhance the perception of empathy. Specifically, we create speech for ERICA using a CycleGAN-based voice conversion model, in which the pitch and spectrogram are converted according to the emotional state. Then, we design dialogue scenarios for the user to talk about...
Tracking and segmenting small targets in remote sensing videos on edge devices carries significant engineering implications. However, many semi-supervised video object segmentation (S-VOS) methods rely heavily on extensive video random-access memory (VRAM) resources, making deployment on edge devices challenging. Our goal is to develop an edge-deployable S-VOS method that can achieve high-precision tracking and segmentation by selecting a bounding box for the target object. First, a tracker is introduced to pinpoint the position of the tracked...
Social connectedness is vital for developing group cohesion and strengthening belongingness. However, with the accelerating pace of modern life, people have fewer opportunities to participate in group-building activities. Furthermore, owing to the teleworking and quarantine requirements necessitated by the COVID-19 pandemic, the social connectedness of group members may become weak. To address this issue, in this study, we used an android robot to conduct daily conversations, and as an intermediary to increase intra-group connectedness. Specifically,...
Automatic depression analysis has been widely investigated on face videos that have been carefully collected and annotated in lab conditions. However, videos collected under real-world conditions may suffer from various types of noise due to challenging data acquisition conditions and a lack of annotators. Although deep learning (DL) models frequently show excellent performances on datasets collected in controlled lab conditions, such noise may degrade their generalization abilities for real-world depression analysis tasks. In this article, we uncovered that noisy facial data and annotations consistently...
In this study, we try to recognize the similarities between different languages in expressing basic human emotions by cross/multi-language corpus training of a novel emotion recognition model based on a one-dimensional convolutional neural network (CNN) and a bi-directional long short-term memory (bi-LSTM) network with an attention mechanism, named CAbiLS. We train and test the model using various combinations of three corpora (German, Chinese and Italian) and also discuss further improvements to be made to speech emotion recognition (SER) systems in order...
Some robots for human-robot interaction are designed with a female or male physical appearance. Other robots are endowed with no gender characteristics, namely genderless robots, such as Pepper and NAO. A robot with a genderless appearance should possess the mapped speech style during a natural interaction, which can be learned from humans' speech. In this paper, we make a new trial to synthesize gender-free speech for physically genderless robots, which is promising for improving interaction with more genderless robots. Our style-controlled synthesizer takes text embedding...
Emotion recognition has been gaining attention in recent years due to its applications on artificial agents. To achieve good performance on this task, much research has been conducted on multi-modality emotion recognition models for leveraging the different strengths of each modality. However, a research question remains: what exactly is the most appropriate way to fuse the information from different modalities? In this paper, we proposed audio sample augmentation and an emotion-oriented encoder-decoder to improve the performance of emotion recognition and discussed inter-modality,...
With the increasing volume of healthcare data, automated International Classification of Diseases (ICD) coding has become increasingly relevant and is frequently regarded as a medical multi-label prediction problem. Current methods struggle to accurately classify medical diagnosis texts that represent deep and sparse categories. Unlike these works that model the label with the code hierarchy or description for label prediction, we argue that label generation with structural information can provide more comprehensive knowledge based on the observation...