- Emotion and Mood Recognition
- Topic Modeling
- Speech and Audio Processing
- Natural Language Processing Techniques
- Speech and dialogue systems
- Advanced Data Compression Techniques
- Speech Recognition and Synthesis
- Advanced Text Analysis Techniques
- Multimedia Communication and Technology
- Sentiment Analysis and Opinion Mining
- Online and Blended Learning
- Music and Audio Processing
- Mobile Learning in Education
- Video Analysis and Summarization
- Recommender Systems and Techniques
- Social Robot Interaction and HRI
- Text and Document Classification Technologies
- Industrial Vision Systems and Defect Detection
- Advanced Image and Video Retrieval Techniques
- Infant Health and Development
- Advanced Neural Network Applications
- Peer-to-Peer Network Technologies
- Functional Brain Connectivity Studies
- AI in Service Interactions
- Cloud Computing and Remote Desktop Technologies
Soochow University
2022-2024
Soochow University
2022
National Cheng Kung University
2014-2021
National University of Tainan
2020
National Chung Cheng University
2007-2013
National Pingtung University of Science and Technology
2005
Speech emotion recognition is becoming increasingly important for many applications. In real-life communication, non-verbal sounds within an utterance also play an important role for people to recognize emotion. In current studies, only a few emotion recognition systems considered nonverbal sounds, such as laughter, cries or other emotion interjections, which naturally exist in our daily conversation. In this work, both verbal and nonverbal sounds within an utterance were thus considered for emotion recognition of real-life conversations. Firstly, an SVM-based verbal/nonverbal sound detector was developed. A Prosodic Phrase...
Owing to demographic changes, services designed for the elderly are more needed than before and increasingly important. In previous work, social media or community-based question-answer data were generally used to build the chatbot. In this study, we collected the MHMC chitchat dataset from daily conversations with the elderly. Since people are free to say anything to the system, the collected sentences were converted into patterns in the preprocessing part to cover the variability of conversational sentences. Then, an LSTM-based...
This study proposes a two-stage method for variable-length abstractive summarization. This is an improvement over previous models, in that the proposed approach can simultaneously achieve fluent and variable-length abstractive summarization. The proposed abstractive summarization model consists of a text segmentation module and a two-stage Transformer-based abstractive summarization module. First, the text segmentation module utilizes a pre-trained Bidirectional Encoder Representations from Transformers (BERT) and a bidirectional long short-term memory (LSTM) to divide the input text into segments. An extractive summarization model based on the BERT-based summarization model (BERTSUM) is then...
In real-life communication, nonverbal vocalizations such as laughter, cries or other emotion interjections within an utterance play an important role for emotion expression. In previous studies, only a few emotion recognition systems consider nonverbal vocalization, which naturally exists in our daily conversation. In this work, both verbal and nonverbal sounds within an utterance are considered for emotion recognition of real-life affective conversations. Firstly, a support vector machine (SVM)-based verbal and nonverbal sound detector is developed. A prosodic phrase auto-tagger is further employed to extract the...
Mood disorders, including unipolar depression (UD) and bipolar disorder (BD) [1], are reported to be among the most common mental illnesses in recent years. In diagnostic evaluation of outpatients with mood disorder, a large portion of BD patients are initially misdiagnosed as having UD [2]. As most previous research focused on long-term monitoring of mood disorder, short-term detection, which could be used in early detection and intervention, is thus desirable. This work proposes an approach to short-term detection of mood disorder based on the patterns in emotion of elicited speech responses....
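This study proposes a long short-term memory (LSTM)-based approach to text emotion recognition based on the semantic word vector and emotional word vector of the input text. For each word in an input text, the semantic word vector is extracted from the word2vec model. Besides, each lexical word is projected to all the emotional words defined in an affective lexicon to derive an emotional word vector. An autoencoder is then adopted to obtain the bottleneck features from the emotional word vector for dimensionality reduction. The autoencoder bottleneck features are then concatenated with the features in the semantic word vector to form the final textual features for emotion recognition. Finally, given the textual feature sequence of the entire sentence, the LSTM is used for emotion recognition by...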
In dyadic conversations, turn-taking is a dynamically evolving behavior strongly linked to paralinguistic communication. Turn-taking temporal evolution in a dyadic conversation is inevitable and can be incorporated into a modeling framework for characterizing and recognizing the personality traits (PTs) of the two speakers. This study presents an approach to automatically predicting PTs in a dyadic conversation. First, a recurrent neural network (RNN) was used to model the relationship between Big Five Inventory 10 (BFI-10) items...
In early stages, patients with bipolar disorder are often diagnosed as having unipolar depression in mood disorder diagnosis. Because long-term monitoring is limited by the delayed detection of mood disorder, an accurate and one-time diagnosis is desirable to avoid delay in appropriate treatment due to misdiagnosis. In this paper, an elicitation-based approach is proposed for realizing a one-time diagnosis by using responses elicited from patients by having them watch six emotion-eliciting videos. After watching each video clip, the conversations, including patient...
A complete emotional expression typically contains a complex temporal course in a natural conversation. Related research on utterance-level, segment-level and multi-level processing lacks understanding of the underlying relation of emotional speech. In this work, a convolutional neural network (CNN) with audio word-based embedding is proposed for emotion modeling. In this study, vector quantization is first applied to convert the low-level features of each speech frame into audio words using the k-means algorithm. Word2vec is adopted to convert an...
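The vector quantization step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each speech frame is already a small feature vector; the function names (`kmeans_codebook`, `quantize`), the toy frames, and the codebook size are hypothetical and are not taken from the paper.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, codebook):
    """Index of the codebook entry (audio word) closest to the frame."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(frame, codebook[k]))


def kmeans_codebook(frames, num_words, iterations=20, seed=0):
    """Learn a codebook of `num_words` centroids with plain k-means."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, num_words)]
    for _ in range(iterations):
        clusters = [[] for _ in range(num_words)]
        for frame in frames:
            clusters[nearest(frame, codebook)].append(frame)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def quantize(frames, codebook):
    """Map every frame to the ID of its nearest centroid (its audio word)."""
    return [nearest(frame, codebook) for frame in frames]


# Toy two-dimensional "low-level features" for five frames (made-up values).
frames = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 4.9], [0.05, 0.05]]
codebook = kmeans_codebook(frames, num_words=2)
audio_words = quantize(frames, codebook)
print(audio_words)
```

The resulting sequence of word IDs is the kind of discrete token stream that an embedding model such as Word2vec could then be trained on.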
A complete emotional expression contains a complex temporal course in a conversation. Related research on utterance and segment-level processing lacks consideration of subtle differences in characteristics and historical information. In this work, as the Deep Scattering Spectrum (DSS) can obtain more detailed energy distributions in the frequency domain than the Low Level Descriptors (LLDs), this work combines LLDs and DSS as the speech features. An autoencoder neural network is then applied to extract the bottleneck features for...
This article proposes an approach to response generation using a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is presented by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the turn representation as well as the context representation. A Parallel Double Q-learning algorithm with Deep Neural Network (PD-DQN), combining two DQNs in parallel for the contextual information and the turn information of the input message, respectively, is...
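For readers unfamiliar with double Q-learning, the following is a minimal sketch of the standard tabular update, in which one table selects the next action and the other evaluates it. It is not the PD-DQN described in the article; the dialog states, actions, reward, and hyperparameters are hypothetical.

```python
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    update_a, alpha=0.1, gamma=0.9):
    """Apply one double Q-learning update.

    The table being updated (the learner) picks the greedy next action,
    while the other table (the evaluator) supplies that action's value.
    Decoupling selection from evaluation reduces overestimation bias.
    """
    learner, evaluator = (q_a, q_b) if update_a else (q_b, q_a)
    best_next = max(actions, key=lambda a: learner[(next_state, a)])
    target = reward + gamma * evaluator[(next_state, best_next)]
    learner[(state, action)] += alpha * (target - learner[(state, action)])


# Hypothetical dialog actions and states, for illustration only.
actions = ["ask", "confirm"]
q_a = defaultdict(float)
q_b = defaultdict(float)
q_a[("s1", "confirm")] = 1.0  # the learner prefers "confirm" in state s1
q_b[("s1", "ask")] = 2.0      # the evaluator rates "ask" highly, "confirm" at 0

double_q_update(q_a, q_b, "s0", "ask", reward=1.0, next_state="s1",
                actions=actions, update_a=True)
print(q_a[("s0", "ask")])
```

In practice the table to update is usually chosen at random on each step; the `update_a` flag is exposed here only to keep the example deterministic.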
With the rapid advancement of modern hardware technology, breakthroughs have been made in many areas of artificial intelligence research, leading to the direction of machine replacement or assistance in various fields. However, most deep learning techniques require large amounts of training data and are typically applicable to a single task objective. Acquiring such datasets can be particularly challenging, especially in domains like medical imaging. In the field of image processing, few-shot image segmentation is an area...
In current studies, an extended subjective self-report method is generally used for measuring emotions. Even though it is commonly accepted that the speech emotion perceived by the listener is close to the intended emotion conveyed by the speaker, research has indicated that there still remains a mismatch between them. In addition, individuals with different personalities generally have different emotion expressions. Based on the investigation, in this study, a support vector machine (SVM)-based emotion model is first developed to detect emotion from daily conversational speech....
With the exponential growth in computing power and progress in speech recognition technology, spoken dialog systems (SDSs), with which a user interacts through natural speech, have been widely used in human-computer interaction. However, error-prone automatic speech recognition (ASR) results usually lead to inappropriate semantic interpretation, so that miscommunication happens easily. This paper presents an approach to error-aware dialog state (DS) detection for robust miscommunication handling in an SDS. Non-understanding (Non-U) and misunderstanding (Mis-U)...
This study presents an approach to personality trait (PT) perception from speech signals using wavelet-based multiresolution analysis and convolutional neural networks (CNNs). In this study, first, the wavelet transform is employed to decompose the speech signals into signals at different levels of resolution. Then, the acoustic features of the speech signal at each resolution are extracted. Given the acoustic features, the CNN is adopted to generate the profiles of the Big Five Inventory-10 (BFI-10), which provide a quantitative measure for expressing the degree of the presence or absence...
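The multiresolution decomposition described in this abstract can be sketched with a Haar wavelet. The abstract does not say which wavelet family was used, so Haar is an assumption chosen for simplicity, and the function names and sample values below are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficients, each half the input
    length. A trailing sample of an odd-length input is ignored.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def haar_decompose(signal, levels):
    """Repeatedly split the approximation to get coarser resolutions.

    Returns the coarsest approximation and a list of detail bands,
    ordered from the finest resolution to the coarsest.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# A short made-up signal standing in for speech samples.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(signal, levels=2)
print(approx)
print(details)
```

Each detail band, together with the final approximation, is a view of the signal at one resolution from which per-resolution acoustic features could be computed.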
In mood disorder diagnosis, bipolar disorder (BD) patients are often misdiagnosed as having unipolar depression (UD) on initial presentation. It is crucial to establish an accurate distinction between BD and UD to make a correct and early diagnosis, leading to improvements in treatment and course of illness. To deal with this misdiagnosis problem, in this study, we experimented on eliciting subjects' emotions by watching six emotional video clips. After watching each of the video clips, their speech responses were collected when they were interviewing with a clinician....
In the clinical diagnosis of mood disorder, a large proportion of patients with bipolar disorder (BD) are misdiagnosed as having unipolar depression (UD). Generally, long-term tracking is required for patients with BD to conduct an appropriate diagnosis by using traditional diagnosis tools. A one-time diagnosis system for facilitating diagnosis procedures is thus highly desirable. Accordingly, in this study, the facial expressions of patients with BD, UD, and healthy controls elicited by emotional video clips were used for conducting mood disorder classification; the classification was performed...
In mood disorder diagnosis, bipolar disorder (BD) patients are often misdiagnosed as having unipolar depression (UD) on initial presentation. It is crucial to establish an accurate distinction between BD and UD to make an early diagnosis, leading to improvements in treatment. In this work, facial expressions of the subjects were collected while they were watching emotion-eliciting video clips. In mood disorder detection, first, the facial features extracted from the DISFA database are used to train a support vector machine (SVM) for generating facial action unit (AU) profiles....
In clinical diagnosis of mood disorder, depression is one of the most common psychiatric disorders. There are two major types of mood disorders: major depressive disorder (MDD) and bipolar disorder (BPD). A large portion of BPD is misdiagnosed as MDD in the diagnosis of mood disorders. Short-term detection, which could be used in early detection and intervention, is thus desirable. This study investigates microscopic facial expression changes for the subjects with MDD, BPD and control group (CG), when elicited by emotional video clips. This study uses eight basic orientations of motion...
The general public in Taiwan generally believes that traditional Chinese medicine (TCM) is mild and has no side effects, but they ignore the safety of Chinese medicine. If the name of the medicine and the name of the disease can be correctly identified in the human-machine dialogue, it can help the dialogue system to give correct medication reminders. In this study, a named entity recognition system was constructed and applied to the identification of medicine names and disease names, and the results could be further used in the human-computer dialogue system to provide people with correct medication reminders. First, this study uses a web crawler to organize network...