TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation

DOI: 10.48550/arXiv.2110.09554
Publication Date: 2021-01-01
ABSTRACT
Estimating the 2D human poses in each view is typically the first step in calibrated multi-view 3D pose estimation. But the performance of 2D pose detectors suffers from challenging situations such as occlusions and oblique viewing angles. To address these challenges, previous works derive point-to-point correspondences between different views from epipolar geometry and utilize the correspondences to merge prediction heatmaps or feature representations. Instead of post-prediction merge/calibration, here we introduce a transformer framework for multi-view 3D pose estimation, aiming at directly improving individual 2D predictors by integrating information from different views. Inspired by multi-modal transformers, we design a unified transformer architecture, named TransFusion, to fuse cues from both the current view and neighboring views. Moreover, we propose the concept of epipolar field to encode 3D positional information into the model. The 3D position encoding guided by the epipolar field provides an efficient way of encoding the correspondences between pixels of different views. Experiments on Human 3.6M and Ski-Pose show that our method is more efficient and has consistent improvements compared to other fusion methods. Specifically, we achieve 25.8 mm MPJPE on Human 3.6M with only 5M parameters at 256 x 256 resolution.
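
The epipolar field mentioned in the abstract builds on a standard fact of two-view geometry: the fundamental matrix F maps a pixel in the current view to an epipolar line in a neighboring view, so each neighboring-view pixel can be scored by its distance to that line. The sketch below illustrates only this geometric idea under our own assumptions; it is not the authors' released implementation, and the fundamental matrix used here is a toy placeholder.

```python
import numpy as np

# Illustrative sketch of the geometry behind an "epipolar field"
# (our reading of the abstract, not the authors' code): for a pixel p
# in the current view, its correspondence in a neighboring view lies
# on the epipolar line F @ p. Encoding each neighboring-view pixel by
# its distance to that line yields a dense cross-view positional signal.

def epipolar_line(F, p):
    """Epipolar line l = F @ p in the neighboring view, for pixel
    p = (x, y) of the current view, normalized so that |l . q| is the
    point-to-line distance for a homogeneous pixel q = (x', y', 1)."""
    l = F @ np.array([p[0], p[1], 1.0])
    return l / np.linalg.norm(l[:2])

def epipolar_field(F, p, height, width):
    """Distance of every pixel in the neighboring view to the epipolar
    line of pixel p -- one channel of a cross-view position encoding."""
    l = epipolar_line(F, p)
    ys, xs = np.mgrid[0:height, 0:width]  # pixel grid of the other view
    return np.abs(l[0] * xs + l[1] * ys + l[2])

# Toy fundamental matrix (hypothetical; a real F comes from the
# calibrated camera pair).
F = np.array([[0.0,   -1e-4,  0.02],
              [1e-4,   0.0,  -0.03],
              [-0.02,  0.03,  1.0]])

field = epipolar_field(F, p=(128, 96), height=256, width=256)
print(field.shape)  # (256, 256); values shrink near the epipolar line
```

In the paper's setting such a field would guide cross-view attention inside the transformer, biasing each query pixel toward geometrically plausible pixels in the other view; the snippet above only computes the dense distance map itself.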