Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification
Autoencoder
Discriminative model
Representation
Feature Learning
DOI: 10.1609/aaai.v36i6.20614
Publication Date: 2022-07-04
ABSTRACT
Learning a common latent embedding by aligning the latent spaces of cross-modal autoencoders is an effective strategy for Generalized Zero-Shot Classification (GZSC). However, due to the lack of fine-grained instance-wise annotations, this strategy still easily suffers from the domain shift problem caused by the discrepancy between the visual representations of diversified images and the semantic representations of fixed attributes. In this paper, we propose an innovative autoencoder network that learns Aligned Cross-Modal Representations (dubbed ACMR) for GZSC. Specifically, we propose a novel Vision-Semantic Alignment (VSA) method to strengthen the alignment of cross-modal latent features on latent subspaces guided by a learned classifier. In addition, we propose an Information Enhancement Module (IEM) to reduce the possibility of latent variable collapse while encouraging the discriminative ability of the latent variables. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method.
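
The abstract only sketches the architecture, so the following is a minimal, hedged PyTorch sketch of classifier-guided cross-modal latent alignment in the spirit described above. The paper's actual VSA and IEM formulations are not given on this page; the network shapes (2048-d visual features, 85-d attributes, 50 classes), the loss weights, and the cross-reconstruction and classifier-guidance terms are illustrative assumptions, not the authors' method.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps an input (visual feature or attribute vector) to a Gaussian latent."""
    def __init__(self, in_dim, latent_dim, hidden=512):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    """Reconstructs one modality from a latent code."""
    def __init__(self, latent_dim, out_dim, hidden=512):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

    def forward(self, z):
        return self.body(z)

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick.
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

class AlignedCrossModalAE(nn.Module):
    # Hypothetical dimensions: 2048-d visual features, 85-d attributes, 50 classes.
    def __init__(self, vis_dim=2048, sem_dim=85, latent_dim=64, n_classes=50):
        super().__init__()
        self.enc_v = Encoder(vis_dim, latent_dim)
        self.enc_s = Encoder(sem_dim, latent_dim)
        self.dec_v = Decoder(latent_dim, vis_dim)
        self.dec_s = Decoder(latent_dim, sem_dim)
        # A latent-space classifier; its supervision pulls the two modalities'
        # latents toward the same class regions (one plausible reading of VSA).
        self.clf = nn.Linear(latent_dim, n_classes)

    def loss(self, x_v, x_s, y):
        mu_v, lv_v = self.enc_v(x_v)
        mu_s, lv_s = self.enc_s(x_s)
        z_v = reparameterize(mu_v, lv_v)
        z_s = reparameterize(mu_s, lv_s)
        # Within-modal and cross-modal reconstruction keep both latents
        # informative (one common way to discourage latent collapse).
        rec = (F.mse_loss(self.dec_v(z_v), x_v) + F.mse_loss(self.dec_s(z_s), x_s)
               + F.mse_loss(self.dec_v(z_s), x_v) + F.mse_loss(self.dec_s(z_v), x_s))
        # KL regularizers toward N(0, I), as in a standard VAE.
        kl = (-0.5 * (1 + lv_v - mu_v.pow(2) - lv_v.exp()).mean()
              - 0.5 * (1 + lv_s - mu_s.pow(2) - lv_s.exp()).mean())
        # Classifier-guided alignment: both latents must predict the same label.
        align = F.cross_entropy(self.clf(z_v), y) + F.cross_entropy(self.clf(z_s), y)
        return rec + 0.1 * kl + align

# Usage on random stand-in data:
model = AlignedCrossModalAE()
x_v = torch.randn(32, 2048)          # e.g., CNN visual features
x_s = torch.randn(32, 85)            # class attribute vectors
y = torch.randint(0, 50, (32,))      # class labels
model.loss(x_v, x_s, y).backward()

After training such a model, a GZSC classifier is typically fit on latents sampled from seen-class visual features plus unseen-class attributes; that downstream stage is omitted from this sketch.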