NFDI4DS | UHH-SEMS - Publication Details

Ming-Hsiang Su

ORCID: 0000-0003-0633-774X

About

Contact & Profiles

A5091093449

Research Areas

Emotion and Mood Recognition
Topic Modeling
Speech and Audio Processing
Natural Language Processing Techniques
Speech and dialogue systems
Advanced Data Compression Techniques
Speech Recognition and Synthesis
Advanced Text Analysis Techniques
Multimedia Communication and Technology
Sentiment Analysis and Opinion Mining
Online and Blended Learning
Music and Audio Processing
Mobile Learning in Education
Video Analysis and Summarization
Recommender Systems and Techniques
Social Robot Interaction and HRI
Text and Document Classification Technologies
Industrial Vision Systems and Defect Detection
Advanced Image and Video Retrieval Techniques
Infant Health and Development
Advanced Neural Network Applications
Peer-to-Peer Network Technologies
Functional Brain Connectivity Studies
AI in Service Interactions
Cloud Computing and Remote Desktop Technologies

Soochow University
2022-2024

Soochow University
2022

National Cheng Kung University
2014-2021

National University of Tainan
2020

National Chung Cheng University
2007-2013

National Pingtung University of Science and Technology
2005

Speech Emotion Recognition Using Deep Neural Network Considering Verbal and Nonverbal Speech Sounds

OPENALEX - Publications

Kun-Yi Huang Chung‐Hsien Wu Qian-Bei Hong Ming-Hsiang Su Yi‐Hsuan Chen

Speech emotion recognition is becoming increasingly important for many applications. In real-life communication, non-verbal sounds within an utterance also play an important role for people to recognize emotion. In current studies, only a few emotion recognition systems considered nonverbal sounds, such as laughter, cries or other emotion interjections, which naturally exist in our daily conversation. In this work, both verbal and nonverbal sounds within an utterance were thus considered for emotion recognition of real-life conversations. Firstly, an SVM-based verbal/nonverbal sound detector was developed. A Prosodic Phrase...

10.1109/icassp.2019.8682283 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

A chatbot using LSTM-based multi-layer embedding for elderly care

Ming-Hsiang Su Chung‐Hsien Wu Kun-Yi Huang Qian-Bei Hong Hsin‐Min Wang

According to demographic changes, the services designed for the elderly are becoming more needed than before and increasingly important. In previous work, social media or community-based question-answer data were generally used to build a chatbot. In this study, we collected the MHMC chitchat dataset from daily conversations with the elderly. Since people are free to say anything to the system, the collected sentences are converted into patterns in the preprocessing part to cover the variability of conversational sentences. Then, an LSTM-based...

10.1109/icot.2017.8336091 article EN 2017-12-01

A Two-Stage Transformer-Based Approach for Variable-Length Abstractive Summarization

Ming-Hsiang Su Chung‐Hsien Wu Hao-Tse Cheng

This study proposes a two-stage method for variable-length abstractive summarization. This is an improvement over previous models, in that the proposed approach can simultaneously achieve fluent and variable-length abstractive summarization. The proposed abstractive summarization model consists of a text segmentation module and a two-stage Transformer-based abstractive summarization module. First, the text segmentation module utilizes a pre-trained Bidirectional Encoder Representations from Transformers (BERT) and a bidirectional long short-term memory (LSTM) to divide the input text into segments. An extractive model based on the BERT-based summarization model (BERTSUM) is then...

10.1109/taslp.2020.3006731 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01

Speech Emotion Recognition Considering Nonverbal Vocalization in Affective Conversations

Jia-Hao Hsu Ming-Hsiang Su Chung‐Hsien Wu Yi‐Hsuan Chen

In real-life communication, nonverbal vocalization such as laughter, cries or other emotion interjections, within an utterance plays an important role for emotion expression. In previous studies, only a few emotion recognition systems consider nonverbal vocalization, which naturally exists in our daily conversation. In this work, both verbal and nonverbal sounds within an utterance are considered for emotion recognition of real-life affective conversations. Firstly, a support vector machine (SVM)-based verbal and nonverbal sound detector is developed. A prosodic phrase auto-tagger is further employed to extract the...

10.1109/taslp.2021.3076364 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Detecting Unipolar and Bipolar Depressive Disorders from Elicited Speech Responses Using Latent Affective Structure Model

Kun-Yi Huang Chung‐Hsien Wu Ming-Hsiang Su Yu‐Ting Kuo

Mood disorders, including unipolar depression (UD) and bipolar disorder (BD) [1], are reported to be one of the most common mental illnesses in recent years. In diagnostic evaluation on outpatients with mood disorder, a large portion of BD patients are initially misdiagnosed as having UD [2]. As previous research focused on long-term monitoring of mood disorders, short-term detection, which could be used in early intervention, is thus desirable. This work proposes an approach based on patterns of emotion in elicited speech responses....

10.1109/taffc.2018.2803178 article EN IEEE Transactions on Affective Computing 2018-02-09

Attention-based convolutional neural network and long short-term memory for short-term detection of mood disorders based on elicited speech responses

Kun-Yi Huang Chung‐Hsien Wu Ming-Hsiang Su

10.1016/j.patcog.2018.12.016 article EN Pattern Recognition 2018-12-17

LSTM-based Text Emotion Recognition Using Semantic and Emotional Word Vectors

Ming-Hsiang Su Chung‐Hsien Wu Kun-Yi Huang Qian-Bei Hong

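This study proposes a long-short term memory (LSTM)-based approach to text emotion recognition based on the semantic word vector and emotional word vector of the input text. For each word in an input text, the semantic word vector is extracted from the word2vec model. Besides, each lexical word is projected to all the emotional words defined in an affective lexicon to derive an emotional word vector. An autoencoder is then adopted to obtain the bottleneck features from the emotional word vector for dimensionality reduction. The autoencoder bottleneck features are concatenated with the semantic word vector to form the final textual features for emotion recognition. Finally, given the textual feature sequence of the entire sentence, the LSTM is used by...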

10.1109/aciiasia.2018.8470378 article EN 2018-05-01

Coupled HMM-based multimodal fusion for mood disorder detection through elicited audio–visual signals

Tsung-Hsien Yang Chung‐Hsien Wu Kun-Yi Huang Ming-Hsiang Su

10.1007/s12652-016-0395-y article EN Journal of Ambient Intelligence and Humanized Computing 2016-07-19

Exploiting Turn-Taking Temporal Evolution for Personality Trait Perception in Dyadic Conversations

Ming-Hsiang Su Chung‐Hsien Wu Yuting Zheng

In dyadic conversations, turn-taking is a dynamically evolving behavior strongly linked to paralinguistic communication. Turn-taking temporal evolution in a dyadic conversation is inevitable and can be incorporated into a modeling framework for characterizing and recognizing the personality traits (PTs) of the two speakers. This study presents an approach to automatically predicting PTs in a dyadic conversation. First, a recurrent neural network (RNN) was used to model the relationship between Big Five Inventory 10 (BFI-10) items...

10.1109/taslp.2016.2531286 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2016-02-19

Cell-Coupled Long Short-Term Memory With $L$-Skip Fusion Mechanism for Mood Disorder Detection Through Elicited Audiovisual Features

Ming-Hsiang Su Chung‐Hsien Wu Kun-Yi Huang Tsung‐Hsien Yang

In early stages, patients with bipolar disorder are often diagnosed as having unipolar depression in mood disorder diagnosis. Because the long-term monitoring is limited by the delayed detection of mood disorder, an accurate and one-time diagnosis is desirable to avoid delay in appropriate treatment due to misdiagnosis. In this paper, an elicitation-based approach is proposed for realizing a one-time diagnosis by using responses elicited from patients by having them watch six emotion-eliciting videos. After watching each video clip, the conversations, including patient...

10.1109/tnnls.2019.2899884 article EN IEEE Transactions on Neural Networks and Learning Systems 2019-03-18

Speech Emotion Recognition using Convolutional Neural Network with Audio Word-based Embedding

Kun-Yi Huang Chung‐Hsien Wu Qian-Bei Hong Ming-Hsiang Su Yuan-Rong Zeng

A complete emotional expression typically contains a complex temporal course in a natural conversation. Related research on utterance-level, segment-level and multi-level processing lacks understanding of the underlying relation of emotional speech. In this work, a convolutional neural network (CNN) with audio word-based embedding is proposed for emotion modeling. In this study, vector quantization is first applied to convert the low level features of each speech frame into audio words using the k-means algorithm. Word2vec adopted an...

10.1109/iscslp.2018.8706610 article EN 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2018-11-01

Speech emotion recognition using autoencoder bottleneck features and LSTM

Kun-Yi Huang Chung‐Hsien Wu Tsung‐Hsien Yang Ming-Hsiang Su Jiahui Chou

A complete emotional expression contains a complex temporal course in a conversation. Related research on utterance and segment-level processing lacks considering subtle differences in characteristics and historical information. In this work, as the Deep Scattering Spectrum (DSS) can obtain more detailed energy distributions in the frequency domain than the Low Level Descriptors (LLDs), this work combines LLDs and DSS as the speech features. An autoencoder neural network is then applied to extract the bottleneck features for...

10.1109/icot.2016.8278965 article EN 2016-12-01

Attention-Based Response Generation Using Parallel Double Q-Learning for Dialog Policy Decision in a Conversational System

Ming-Hsiang Su Chung‐Hsien Wu Liangyu Chen

This article proposes an approach to response generation using a Parallel Double Q-learning algorithm for dialog policy decision in a conversational system. First, a new semantic representation of the user's input sentence is presented by using the CKIP parser to derive the semantic dependency sequence of the input sentence. Then, a Gated Recurrent Unit-based Autoencoder is used to obtain the turn representation as well as the context representation. A Parallel Double Q-learning algorithm with Deep Neural Network (PD-DQN), combining two DQNs in parallel for the contextual information and the user's input message, respectively, are...

10.1109/taslp.2019.2949687 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-10-25

Few-Shot Image Segmentation Using Generating Mask with Meta-Learning Classifier Weight Transformer Network

Jian‐Hong Wang Phuong Thi Le Fong-Ci Jhou Ming-Hsiang Su Kuo-Chen Li and 6 more

With the rapid advancement of modern hardware technology, breakthroughs have been made in many areas of artificial intelligence research, leading to the direction of machine replacement or assistance in various fields. However, most deep learning techniques require large amounts of training data and are typically applicable to a single task objective. Acquiring such datasets can be particularly challenging, especially in domains like medical imaging. In the field of image processing, few-shot image segmentation is an area...

10.3390/electronics13132634 article EN Electronics 2024-07-04

Mood detection from daily conversational speech using denoising autoencoder and LSTM

Kun-Yi Huang Chung‐Hsien Wu Ming-Hsiang Su Hsiang-Chi Fu

In current studies, an extended subjective self-report method is generally used for measuring emotions. Even though it is commonly accepted that the speech emotion perceived by the listener is close to the intended emotion conveyed by the speaker, research has indicated that there still remains a mismatch between them. In addition, individuals with different personalities have different emotion expressions. Based on the investigation, in this study, a support vector machine (SVM)-based emotion model is first developed to detect emotion from daily conversational speech....

10.1109/icassp.2017.7953133 article EN 2017-03-01

Follow-Up Question Generation Using Neural Tensor Network-Based Domain Ontology Population in an Interview Coaching System

Ming-Hsiang Su Chung‐Hsien Wu Yi Chang

10.21437/interspeech.2019-1300 article EN Interspeech 2019 2019-09-13

Follow-up Question Generation Using Pattern-based Seq2seq with a Small Corpus for Interview Coaching

Ming-Hsiang Su Chung‐Hsien Wu Kun-Yi Huang Qian-Bei Hong Huai-Hung Huang

10.21437/interspeech.2018-1007 article EN Interspeech 2018 2018-08-28

Miscommunication handling in spoken dialog systems based on error-aware dialog state detection

Chung‐Hsien Wu Ming-Hsiang Su Wei-Bin Liang

With the exponential growth in computing power and progress in speech recognition technology, spoken dialog systems (SDSs) with which a user interacts through natural speech have been widely used in human-computer interaction. However, error-prone automatic speech recognition (ASR) results usually lead to inappropriate semantic interpretation so that miscommunication happens easily. This paper presents an approach to error-aware dialog state (DS) detection for robust miscommunication handling in an SDS. Non-understanding (Non-U) and misunderstanding (Mis-U)...

10.1186/s13636-017-0107-3 article EN cc-by EURASIP Journal on Audio Speech and Music Processing 2017-05-08

Personality trait perception from speech signals using multiresolution analysis and convolutional neural networks

Ming-Hsiang Su Chung‐Hsien Wu Kun-Yi Huang Qian-Bei Hong Hsin‐Min Wang

This study presents an approach to personality trait (PT) perception from speech signals using wavelet-based multiresolution analysis and convolutional neural networks (CNNs). In this study, first, wavelet transform is employed to decompose the speech signals into signals at different levels of resolution. Then, the acoustic features of the speech signals at each resolution are extracted. Given the acoustic features, a CNN is adopted to generate the profiles of the Big Five Inventory-10 (BFI-10), which provide a quantitative measure for expressing the degree of the presence or absence...

10.1109/apsipa.2017.8282287 article EN 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2017-12-01

Detection of mood disorder using speech emotion profiles and LSTM

Tsung‐Hsien Yang Chung‐Hsien Wu Kun-Yi Huang Ming-Hsiang Su

In mood disorder diagnosis, bipolar disorder (BD) patients are often misdiagnosed as unipolar depression (UD) on initial presentation. It is crucial to establish an accurate distinction between BD and UD to make a correct and early diagnosis, leading to improvements in treatment and course of illness. To deal with this misdiagnosis problem, in this study, we experimented on eliciting subjects' emotions by watching six emotional video clips. After watching each of the video clips, their speech responses were collected when they were interviewing with a clinician....

10.1109/iscslp.2016.7918439 article EN 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2016-10-01

Exploring Macroscopic and Microscopic Fluctuations of Elicited Facial Expressions for Mood Disorder Classification

Qian-Bei Hong Chung‐Hsien Wu Ming-Hsiang Su Chia-Cheng Chang

In the clinical diagnosis of mood disorder, a large proportion of patients with bipolar disorder (BD) are misdiagnosed as having unipolar depression (UD). Generally, long-term tracking is required for patients with BD to conduct an appropriate diagnosis by using traditional diagnosis tools. A one-time diagnosis system for facilitating diagnosis procedures is thus highly desirable. Accordingly, in this study, the facial expressions of patients with BD, UD, and healthy controls elicited by emotional video clips were used for conducting mood disorder classification; the classification was performed...

10.1109/taffc.2019.2909873 article EN IEEE Transactions on Affective Computing 2020-04-22

Detection of mood disorder using modulation spectrum of facial action unit profiles

Tsung‐Hsien Yang Chung‐Hsien Wu Ming-Hsiang Su Chia-Cheng Chang

In mood disorder diagnosis, bipolar disorder (BD) patients are often misdiagnosed as unipolar depression (UD) on initial presentation. It is crucial to establish an accurate distinction between BD and UD to make an early diagnosis, leading to improvements in treatment. In this work, facial expressions of the subjects are collected when they were watching the eliciting emotional video clips. In mood disorder detection, first, the facial features extracted from the DISFA database are used to train a support vector machine (SVM) for generating facial action unit (AU) profiles....

10.1109/icot.2016.8278966 article EN 2016-12-01

Exploring microscopic fluctuation of facial expression for mood disorder classification

Ming-Hsiang Su Chung‐Hsien Wu Kun-Yi Huang Qian-Bei Hong Hsin‐Min Wang

In clinical diagnosis of mood disorder, depression is one of the most common psychiatric disorders. There are two major types of mood disorders: major depressive disorder (MDD) and bipolar disorder (BPD). A large portion of BPD cases are misdiagnosed as MDD in diagnosis of mood disorder. Short-term detection which could be used in early intervention is thus desirable. This study investigates microscopic facial expression changes for the subjects with MDD, BPD and control group (CG), when elicited by emotional video clips. This study uses eight basic orientations of motion...

10.1109/icot.2017.8336090 article EN 2017-12-01

BERT-based Chinese Medicine Named Entity Recognition Model Applied to Medication Reminder Dialogue System

Tsung‐Hsien Yang Matúš Pleva Daniel Hládek Ming-Hsiang Su

The general public in Taiwan generally believes that traditional Chinese medicine (TCM) is mild and has no side effects, but they ignore the safety of Chinese medicine. If the name of the Chinese medicine and the name of the disease can be correctly identified in the human-machine dialogue, it can help the dialogue system to give correct medication reminders. In this study, a Chinese medicine named entity recognition model was constructed and applied to the identification of Chinese medicine names and disease names, and the results could be further used in the human-computer dialogue system to provide people with correct medication reminders. First, this study uses a web crawler to organize network...

10.1109/iscslp57327.2022.10037867 article EN 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2022-12-11
