VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization

DOI: 10.1609/aaai.v35i7.16767
Publication Date: 2022-09-08
ABSTRACT
Recent learning-based approaches have achieved impressive results in the field of single-shot camera localization. However, how best to fuse multiple modalities (e.g., image and depth) and to deal with degraded or missing input are less well studied. In particular, we note that previous approaches towards deep fusion do not perform significantly better than models employing a single modality. We conjecture that this is because of the naive approaches to feature space fusion through summation or concatenation, which do not take into account the different strengths of each modality. To address this, we propose an end-to-end framework, termed VMLoc, to fuse different sensor inputs into a common latent space through a variational Product-of-Experts (PoE) followed by attention-based fusion. Unlike previous multimodal variational works directly adapting the objective function of the vanilla variational auto-encoder, we show how camera localization can be accurately estimated through an unbiased objective function based on importance weighting. Our model is extensively evaluated on RGB-D datasets and the results prove the efficacy of our model. The source code is available at https://github.com/Zalex97/VMLoc.
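Below is a minimal sketch of the Gaussian Product-of-Experts fusion step mentioned in the abstract, assuming (as is common in multimodal variational models) that each modality encoder predicts a diagonal-Gaussian posterior over the shared latent space and that the fused posterior is the product of these Gaussians together with a unit prior. The function name poe_fuse and the toy inputs are illustrative assumptions, not the authors' released implementation.

import numpy as np

def poe_fuse(mus, logvars, eps=1e-8):
    """Fuse Gaussian experts N(mu_i, var_i) into a single Gaussian.

    mus, logvars: arrays of shape (num_experts, latent_dim).
    Returns the fused mean and log-variance, each of shape (latent_dim,).
    """
    var = np.exp(logvars) + eps              # per-expert variances
    precision = 1.0 / var                    # per-expert precisions
    fused_var = 1.0 / precision.sum(axis=0)  # product of Gaussians: precisions add
    fused_mu = fused_var * (mus * precision).sum(axis=0)
    return fused_mu, np.log(fused_var)

# Toy usage: an image expert, a depth expert, and a standard-normal prior expert.
latent_dim = 4
mu_img, logvar_img = np.random.randn(latent_dim), np.zeros(latent_dim)
mu_depth, logvar_depth = np.random.randn(latent_dim), np.zeros(latent_dim)
mu_prior, logvar_prior = np.zeros(latent_dim), np.zeros(latent_dim)

mu, logvar = poe_fuse(
    np.stack([mu_prior, mu_img, mu_depth]),
    np.stack([logvar_prior, logvar_img, logvar_depth]),
)
print(mu, logvar)

One practical property of this formulation is that a missing or degraded modality can simply be dropped from the product, which is one motivation for PoE-style fusion over plain summation or concatenation of features.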