Acoustic Feature Excitation-and-Aggregation Network Based on Multi-Task Learning for Speech Emotion Recognition

DOI: 10.3390/electronics14050844 Publication Date: 2025-02-21T12:53:06Z
ABSTRACT
In recent years, substantial research has focused on emotion recognition using multi-stream speech representations. In existing multi-stream speech emotion recognition (SER) approaches, effectively extracting and fusing speech features is crucial. To overcome the bottleneck in SER caused by fusing inter-feature information, including challenges such as modeling complex feature relations and inefficient fusion methods, this paper proposes an SER framework based on multi-task learning, named AFEA-Net. The framework consists of a speech emotion alignment learning (SEAL) module, an acoustic feature excitation-and-aggregation (AFEA) mechanism, and a continuity learning strategy. First, SEAL aligns emotion information between WavLM and Fbank features. Then, we design the acoustic feature excitation-and-aggregation mechanism to adaptively calibrate and merge the two features. Furthermore, we introduce a continuity learning strategy to explore the distinctiveness and complementarity of dual-stream features from intra- and inter-speech perspectives. Experimental results on the publicly available IEMOCAP and RAVDESS emotion datasets show that our proposed approach outperforms state-of-the-art SER approaches. Specifically, we achieve 75.1% WA, 75.3% UAR, 76.0% precision, and 75.4% F1-score on IEMOCAP, and 80.3%, 80.6%, 80.8%, and 80.4% on RAVDESS, respectively.
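The abstract does not specify how the excitation-and-aggregation fusion is implemented, so the following is only a minimal PyTorch sketch of one plausible reading: a squeeze-and-excitation-style gating of each stream (e.g., projected WavLM and Fbank features) followed by a learned aggregation. All module names, layer sizes, and design choices here are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of a dual-stream excitation-and-aggregation fusion block.
# Assumes both streams have already been projected to a common feature size.
import torch
import torch.nn as nn


class DualStreamExcitationFusion(nn.Module):
    """Adaptively re-weight (excite) two aligned feature streams and aggregate them."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        # Bottleneck gates producing per-channel excitation weights for each stream.
        self.excite_a = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.Sigmoid(),
        )
        self.excite_b = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.Sigmoid(),
        )
        # Learns how much each calibrated stream contributes to the fused output.
        self.aggregate = nn.Linear(2 * dim, dim)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a, feat_b: (batch, time, dim) frame-level features from the two streams.
        pooled_a = feat_a.mean(dim=1)                              # squeeze: temporal average
        pooled_b = feat_b.mean(dim=1)
        gated_a = feat_a * self.excite_a(pooled_a).unsqueeze(1)    # excite stream A
        gated_b = feat_b * self.excite_b(pooled_b).unsqueeze(1)    # excite stream B
        return self.aggregate(torch.cat([gated_a, gated_b], dim=-1))  # aggregate


if __name__ == "__main__":
    fusion = DualStreamExcitationFusion(dim=256)
    wavlm_like = torch.randn(2, 100, 256)   # stand-in for projected WavLM features
    fbank_like = torch.randn(2, 100, 256)   # stand-in for projected Fbank features
    print(fusion(wavlm_like, fbank_like).shape)  # torch.Size([2, 100, 256])
```

In this sketch the excitation step plays the role of per-stream calibration and the final linear layer the role of aggregation; the paper's actual AFEA mechanism may differ in both structure and training objectives.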