- Speech Recognition and Synthesis
- Speech and Audio Processing
- Natural Language Processing Techniques
- Speech and dialogue systems
- Topic Modeling
- Emotion and Mood Recognition
- Music and Audio Processing
- Phonetics and Phonology Research
- Advanced Data Compression Techniques
- Advanced Text Analysis Techniques
- Sentiment Analysis and Opinion Mining
- Hand Gesture Recognition Systems
- Chalcogenide Semiconductor Thin Films
- Quantum Dots Synthesis And Properties
- Face and Expression Recognition
- Face recognition and analysis
- Advanced Image and Video Retrieval Techniques
- Blind Source Separation Techniques
- Neural Networks and Applications
- Image Retrieval and Classification Techniques
- Video Analysis and Summarization
- Semantic Web and Ontologies
- Mental Health Research Topics
- Social Robot Interaction and HRI
- Supercapacitor Materials and Fabrication
National Cheng Kung University
2016-2025
National Cheng Kung University Hospital
2002-2025
Institute of Electrical and Electronics Engineers
2006-2024
National Taipei University of Technology
2022-2024
Academia Sinica
2007-2023
National Yang Ming Chiao Tung University
2004-2022
Kermanshah University of Medical Sciences
2020
University of Nizwa
2020
Normandie Université
2020
Albany State University
2020
This work presents an approach to emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information (AP) and semantic labels (SLs). For AP-based recognition, acoustic-prosodic features, including spectrum-, formant-, and pitch-related features, are extracted from the detected emotionally salient segments of the input speech. Three types of models, GMMs, SVMs, and MLPs, are adopted as base-level classifiers. A Meta Decision Tree (MDT) is then employed for classifier fusion to obtain...
This study presents a novel approach to automatic emotion recognition from text. First, emotion generation rules (EGRs) are manually deduced from psychology to represent the conditions for generating emotion. Based on the EGRs, the emotional state of each sentence can be represented as a sequence of semantic labels (SLs) and attributes (ATTs); SLs are defined as domain-independent features, while ATTs are domain-dependent. The emotion association rules (EARs) are then automatically derived from the sentences in an emotional text corpus using the a priori algorithm. Finally,...
Emotion recognition is the ability to identify what people would think someone is feeling from moment to moment and to understand the connection between his/her feelings and expressions. In today's world, the human–computer interaction (HCI) interface undoubtedly plays an important role in our daily life. Toward a harmonious HCI interface, automated analysis of human emotion has attracted increasing attention from researchers in multidisciplinary research fields. In this paper, a survey on theoretical and practical work offering new...
This paper presents an approach to the automatic recognition of human emotions from audio-visual bimodal signals using an error weighted semi-coupled hidden Markov model (EWSC-HMM). The proposed approach combines an SC-HMM with a state-based alignment strategy and a Bayesian classifier weighting scheme to obtain the optimal emotion recognition result based on bimodal fusion. The state-based alignment strategy in the SC-HMM is used to align the temporal relation between the audio and visual streams. The Bayesian classifier weighting scheme is then adopted to explore the contributions of the SC-HMM-based classifiers for different feature pairs in order to obtain the output....
Speech emotion recognition is becoming increasingly important for many applications. In real-life communication, non-verbal sounds within an utterance also play an important role for people to recognize emotion. In current studies, only a few emotion recognition systems considered nonverbal sounds, such as laughter, cries, or other emotion interjections, which naturally exist in our daily conversation. In this work, both verbal and nonverbal sounds within an utterance were thus considered for emotion recognition of conversations. Firstly, an SVM-based verbal/nonverbal sound detector was developed. A Prosodic Phrase...
Two-terminal (2-T) perovskite (PVK)/CuIn(Ga)Se₂ (CIGS) tandem solar cells (TSCs) have been considered an ideal tandem cell because of their best bandgap matching with regard to the Shockley–Queisser (S–Q) limits. However, the nature of the irregular, rough morphology of commercial CIGS prevents people from improving device performance. In this paper, D-homoserine lactone hydrochloride is proven to improve the coverage of PVK materials on CIGS surfaces and also to passivate bulk defects by modulating the growth of PVK crystals....
Given demographic changes, services designed for the elderly are becoming more needed than before and increasingly important. In previous work, social media or community-based question-answer data were generally used to build the chatbot. In this study, we collected the MHMC chitchat dataset from daily conversations with the elderly. Since people are free to say anything to the system, the collected sentences are converted into patterns in the preprocessing part to cover the variability of conversational sentences. Then, an LSTM-based...
This study proposes a two-stage method for variable-length abstractive summarization. This is an improvement over previous models, in that the proposed approach can simultaneously achieve fluent and variable-length abstractive summarization. The proposed summarization model consists of a text segmentation module and a Transformer-based summarization module. First, the text segmentation module utilizes a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and a bidirectional long short-term memory (LSTM) network to divide the input text into segments. An extractive model based on the BERT-based summarization model (BERTSUM) is then...
In real-life communication, nonverbal vocalizations such as laughter, cries, or other emotion interjections within an utterance play an important role for emotion expression. In previous studies, only a few emotion recognition systems consider nonverbal vocalization, which naturally exists in our daily conversation. In this work, both verbal and nonverbal sounds within an utterance are considered for emotion recognition of affective conversations. Firstly, a support vector machine (SVM)-based verbal and nonverbal sound detector is developed. A prosodic phrase auto-tagger is further employed to extract the...
This paper presents an approach to emotion recognition from speech signals and textual content. In the analysis of speech signals, thirty-three acoustic features are extracted from the speech input. After Principal Component Analysis (PCA) is performed, 14 principal components are selected for discriminative representation. In this representation, each principal component is a combination of the 33 original acoustic features and forms a feature subspace. Support Vector Machines (SVMs) are adopted to classify the emotional states. In the text analysis, all keywords and modification...
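The PCA step described in this abstract (33 acoustic features reduced to 14 principal components) can be illustrated with a minimal sketch. The function name `pca_project`, the sample count of 200, and the random stand-in data are assumptions for illustration only; the paper's actual features, preprocessing, and SVM classifier are not reproduced here.

```python
import numpy as np

def pca_project(features, n_components=14):
    """Project feature rows onto the top principal components (hypothetical helper)."""
    # Center each feature column so the SVD captures variance, not the mean.
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]  # shape: (n_components, n_features)
    return centered @ components.T, components

# Random stand-in for 200 utterances with 33 acoustic features each (not real data).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 33))
projected, components = pca_project(features, n_components=14)
```

Each row of `components` is a linear combination of the 33 original features, matching the abstract's description of a principal component; the 14-dimensional rows of `projected` would then be the input to a classifier such as an SVM.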
This paper presents an expressive voice conversion model (DeBi-HMM) as the post-processing of a text-to-speech (TTS) system for expressive speech synthesis. DeBi-HMM is named for its duration-embedded characteristic of the two HMMs for modeling the source and target speech signals, respectively. Joint estimation is exploited for spectrum conversion from neutral to expressive speech. A gamma distribution is embedded as the duration model for each state in the HMMs. The style-dependent decision trees achieve prosodic conversion. The STRAIGHT algorithm is adopted for the analysis and synthesis process....
A complete emotional expression typically contains a complex temporal course in face-to-face natural conversation. To address this problem, a bimodal hidden Markov model (HMM)-based emotion recognition scheme, constructed in terms of sub-emotional states, which are defined to represent the temporal phases of onset, apex, and offset, is adopted to model the temporal course of an emotional expression for audio and visual signal streams. A two-level hierarchical alignment mechanism is proposed to align the relationship within and between the temporal phases in the audio and visual HMM sequences at the model and state levels in a semi-coupled...
This study proposes a speech enhancement method based on compressive sensing. The main procedures involved in the proposed method are performed in the frequency domain. First, an overcomplete dictionary is constructed from the trained speech frames. The atoms of this redundant dictionary are spectrum vectors that are trained by the K-SVD algorithm to ensure the sparsity of the dictionary. For a noisy speech spectrum, formant detection and a quasi-SNR criterion are first utilized to determine whether a frequency bin in the spectrogram is reliable, and a corresponding mask is designed. The mask-extracted reliable...
Mood disorders, including unipolar depression (UD) and bipolar disorder (BD) [1], are reported to be among the most common mental illnesses in recent years. In the diagnostic evaluation of outpatients with mood disorder, a large portion of BD patients are initially misdiagnosed as having UD [2]. As previous research focused on long-term monitoring, short-term detection, which could be used for early intervention, is thus desirable. This work proposes an approach based on the patterns of emotion in elicited speech responses....
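This study proposes a long short-term memory (LSTM)-based approach to text emotion recognition based on the semantic word vector and emotional word vector of the input text. For each word in an input text, the semantic word vector is extracted from the word2vec model. Besides, each lexical word is projected to all the emotional words defined in an affective lexicon to derive an emotional word vector. An autoencoder is then adopted to obtain the bottleneck features from the emotional word vector for dimensionality reduction. The bottleneck features are concatenated with the semantic word vector to form the final textual features for emotion recognition. Finally, given the textual feature sequence of the entire sentence, the LSTM is used for emotion recognition by...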
Perovskite-CIGS tandem solar cells offer unique benefits, such as high efficiencies and the possibility for cost-effective production on flexible, lightweight substrates. Yet, the scaling up of these devices presents challenges, particularly because of roughness, solvents, and reactions that can impair the integrity of the subcells within tandem configurations. Here, we introduce a metal-free transparent conductive adhesive (TCA) material, implemented via a lamination approach, as a scalable solution that facilitates the separate...
Background: Establishing an efficient multidisciplinary team for transferred postpartum haemorrhage (PPH) cases is challenging due to limited clinical exposure. We hypothesised that leveraging trauma experience could effectively facilitate the development of such a team within a short timeframe. Methods: In September 2019, a multidisciplinary team was established at our tertiary care centre to provide rapid management of critical PPH cases transferred from obstetric clinics, prioritising immediate resuscitation and haemostatic...
We investigate the primordial gravity waves produced in the early universe within the running vacuum model, which ensures a smooth transition from a primeval inflationary epoch to a radiation-dominant era, ultimately following the standard Hot Big Bang trajectory. In contrast to traditional methods, we approach the gravitational wave equation by reformulating it as an inhomogeneous equation and addressing the back-reaction problem. The effective potential, known as the Grishchuk potential, which drives the cosmic expansion, is crucial in damping the amplitude...
This paper presents an approach to hierarchical prosody conversion for emotional speech synthesis. The pitch contour of the source speech is decomposed into a hierarchical prosodic structure consisting of sentence, prosodic word, and subsyllable levels. The pitch contour in the higher level is encoded by discrete Legendre polynomial coefficients. The residual, the difference between the source pitch contour and the pitch contour decoded from the coefficients, is then used for pitch modeling at the lower level. For prosody conversion, Gaussian mixture models (GMMs) are used for sentence- and prosodic word-level conversion. At the subsyllable level, feature vectors...
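The encode-then-residual idea in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: it swaps in a least-squares fit on NumPy's continuous Legendre basis in place of the discrete Legendre polynomials the abstract names, and the function name `encode_contour`, the polynomial order of 3, and the synthetic contour are all hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def encode_contour(pitch, order=3):
    """Fit a low-order Legendre expansion to a contour and return the residual.

    Assumption: least-squares fit on the continuous Legendre basis over [-1, 1],
    used here as a stand-in for discrete Legendre polynomial coding.
    """
    x = np.linspace(-1.0, 1.0, len(pitch))
    coeffs = legendre.legfit(x, pitch, order)   # order + 1 coefficients
    decoded = legendre.legval(x, coeffs)        # contour rebuilt from coefficients
    residual = pitch - decoded                  # passed on to the lower level
    return coeffs, decoded, residual

# Synthetic pitch contour in Hz: a falling trend plus a fast ripple (not real data).
t = np.linspace(0.0, 1.0, 50)
pitch = 200.0 - 40.0 * t + 5.0 * np.sin(2 * np.pi * 8 * t)
coeffs, decoded, residual = encode_contour(pitch, order=3)
```

The low-order coefficients capture the slow trend of the contour, while `residual` keeps the finer detail that a lower level of the hierarchy would model in turn.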