NFDI4DS | UHH-SEMS - Publication Details

Guangyan Zhang

ORCID: 0000-0002-8640-8933

About

Contact & Profiles

OpenAlex ID: A5019754670

Research Areas

Speech and Audio Processing
Speech Recognition and Synthesis
Music and Audio Processing

Alibaba Group (China)
2023

Chinese University of Hong Kong
2023

iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis Based on Disentanglement Between Prosody and Timbre

OPENALEX - Publications

Guangyan Zhang, Ying Qin, Wenjie Zhang, Jialun Wu, Mei Li, and 3 more

The capability of generating speech with a specific type of emotion is desired for many human-computer interaction applications. Cross-speaker emotion transfer is a common approach to generating emotional speech when speech data with emotion labels from target speakers is not available for model training. This paper presents a novel cross-speaker emotion transfer system named iEmoTTS. The system is composed of an emotion encoder, a prosody predictor, and a timbre encoder. The emotion encoder extracts the identity of emotion type and the respective emotion intensity from the mel-spectrogram of input speech. The emotion intensity is measured by the posterior probability that...

DOI: 10.1109/taslp.2023.3268571 | article | EN | IEEE/ACM Transactions on Audio, Speech, and Language Processing | 2023-01-01
