Enhancing Facial Expression Recognition through Light Field Cameras

DOI: 10.3390/s24175724 Publication Date: 2024-09-03T12:38:47Z
ABSTRACT
In this paper, we study facial expression recognition (FER) using three modalities obtained from a light field camera: sub-aperture (SA), depth map, and all-in-focus (AiF) images. Our objective is to construct a more comprehensive and effective FER system by investigating multimodal fusion strategies. For this purpose, we employ EfficientNetV2-S, pre-trained on AffectNet, as our primary convolutional neural network. This model, combined with a BiGRU, is used to process SA images. We evaluate various fusion techniques at both the decision and feature levels to assess their effectiveness in enhancing FER accuracy. Our findings show that the model using SA images surpasses state-of-the-art performance, achieving 88.13% ± 7.42% accuracy under the subject-specific evaluation protocol and 91.88% ± 3.25% under the subject-independent protocol. These results highlight our model's potential for enhancing FER accuracy and robustness, outperforming existing methods. Furthermore, our multimodal fusion approach, integrating SA, AiF, and depth images, demonstrates substantial improvements over unimodal models. The decision-level fusion strategy, particularly using average weights, proved most effective, achieving 90.13% ± 4.95% accuracy under the subject-specific evaluation protocol and 93.33% ± 4.92% under the subject-independent protocol. This approach leverages the complementary strengths of each modality, resulting in a more comprehensive and accurate FER system.
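As a rough illustration of decision-level fusion with average weights, the sketch below averages per-class probabilities from three modality-specific classifiers and picks the highest-scoring class. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the four-class setup, and the probability values are all hypothetical, and the abstract does not say whether fusion is applied to softmax outputs or to some other score.

```python
from typing import List, Sequence


def average_fusion(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across modalities with equal weights."""
    if not prob_sets:
        raise ValueError("need at least one modality")
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all modalities must score the same classes")
    n_modalities = len(prob_sets)
    return [sum(p[c] for p in prob_sets) / n_modalities for c in range(n_classes)]


def predict(prob_sets: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = average_fusion(prob_sets)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs over four expression classes (illustrative only).
sa_probs = [0.10, 0.60, 0.20, 0.10]     # sub-aperture branch
aif_probs = [0.20, 0.30, 0.40, 0.10]    # all-in-focus branch
depth_probs = [0.25, 0.25, 0.25, 0.25]  # depth-map branch, uninformative here

fused = average_fusion([sa_probs, aif_probs, depth_probs])
label = predict([sa_probs, aif_probs, depth_probs])
```

In this toy case the AiF branch alone would choose class 2, but the fused scores favor class 1 because the SA branch is more confident. This is one way that averaging can let modalities compensate for each other.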