Multimodal Sentiment Analysis in Realistic Environments Based on Cross-Modal Hierarchical Fusion Network
Keywords: Feature (linguistics), Modalities, Sensor Fusion
DOI:
10.3390/electronics12163504
Publication Date:
2023-08-18T13:13:44Z
AUTHORS (4)
ABSTRACT
In the real world, multimodal sentiment analysis (MSA) enables the capture and analysis of sentiments by fusing information from multiple modalities, thereby enhancing the understanding of real-world environments. The key challenges lie in handling noise in the acquired data and achieving effective multimodal fusion. When processing noisy data, existing methods utilize a combination of multimodal features to mitigate word-recognition errors caused by the performance limitations of automatic speech recognition (ASR) models. However, the problem remains of how to combine the different modalities more efficiently to address this noise. In multimodal fusion, most existing fusion methods have limited adaptability to the feature differences between modalities, making it difficult to capture the potentially complex nonlinear interactions that may exist between modalities. To overcome the aforementioned issues, this paper proposes a new framework named multimodal-word-refinement and cross-modal-hierarchy (MWRCMH) fusion. Specifically, we utilized a word-correction module to reduce the recognition errors introduced by ASR. During fusion, we designed a cross-modal hierarchical fusion module that employed cross-modal attention mechanisms to fuse pairs of modalities, resulting in fused bimodal-feature information. Then, the obtained bimodal information and the unimodal information were fused through a fusion layer to obtain the final multimodal sentiment representation. Experimental results on the MOSI-SpeechBrain, MOSI-IBM, and MOSI-iFlytek datasets demonstrated that the proposed approach outperformed other comparative methods, with Has0-F1 scores of 76.43%, 80.15%, and 81.93%, respectively. Our approach exhibited better performance, as compared with multiple baselines.
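The fusion scheme the abstract describes — cross-modal attention between pairs of modalities, followed by a layer that combines the bimodal and unimodal features — can be sketched as follows. This is a minimal illustration in plain NumPy, not the paper's implementation: the sequence lengths, feature dimensions, pooling, and the final tanh projection are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_mod, kv_mod):
    """Scaled dot-product attention where the queries come from one
    modality and the keys/values from another (e.g. text attends to audio),
    yielding a fused bimodal feature sequence."""
    d = query_mod.shape[-1]
    scores = query_mod @ kv_mod.T / np.sqrt(d)   # (T_q, T_kv)
    return softmax(scores, axis=-1) @ kv_mod     # (T_q, d)

# Toy unimodal sequences (seq_len x feature_dim); shapes are illustrative.
rng = np.random.default_rng(0)
text, audio, vision = (rng.standard_normal((5, 16)) for _ in range(3))

# Pairwise bimodal fusion via cross-modal attention.
ta = cross_modal_attention(text, audio)    # text attending to audio
tv = cross_modal_attention(text, vision)   # text attending to vision
av = cross_modal_attention(audio, vision)  # audio attending to vision

# Final fusion: pool over time, concatenate bimodal and unimodal features,
# and pass them through a toy nonlinear layer (random projection + tanh).
pooled = [m.mean(axis=0) for m in (ta, tv, av, text, audio, vision)]
fused_in = np.concatenate(pooled)                # (6 * 16,) = (96,)
W = rng.standard_normal((96, 32)) * 0.1          # hypothetical weights
sentiment_feat = np.tanh(fused_in @ W)           # final multimodal feature
print(sentiment_feat.shape)
```

In the actual framework the attention directions, hierarchy depth, and fusion layer are learned components; this sketch only shows the data flow of pairwise cross-modal attention followed by a joint fusion step.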