3D Former: Monocular Scene Reconstruction with 3D SDF Transformers

DOI: 10.48550/arXiv.2301.13510 Publication Date: 2023-01-01
ABSTRACT
Monocular scene reconstruction from posed images is challenging due to the complexity of a large environment. Recent volumetric methods learn to directly predict the TSDF volume and have demonstrated promising results in this task. However, most methods focus on how to extract and fuse the 2D features into a 3D feature volume, but none of them improve the way the 3D volume is aggregated. In this work, we propose an SDF transformer network, which replaces the role of the 3D CNN for better 3D feature aggregation. To reduce the explosive computation of the multi-head attention, we propose a sparse window attention module, where the attention is only calculated between non-empty voxels within a local window. Then a top-down-bottom-up 3D attention network is built for 3D feature aggregation, in which a dilate-attention structure is proposed to prevent geometry degeneration, and two global modules are employed to equip the network with global receptive fields. The experiments on multiple datasets show that this 3D transformer network generates a more accurate and complete reconstruction, which outperforms previous methods by a large margin. Remarkably, the mesh accuracy is improved by 41.8% and the mesh completeness by 25.3% on the ScanNet dataset. Project page: https://weihaosky.github.io/sdfformer.
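The sparse window attention module described above can be illustrated with a short sketch: multi-head attention is restricted to pairs of non-empty voxels that fall inside the same local window, so empty space contributes nothing to the computation. The PyTorch code below is a minimal reading of that description, not the authors' implementation; the class name `SparseWindowAttention`, the `window_size` parameter, and the input layout (integer voxel coordinates plus per-voxel features) are all assumptions, and a production version would gather per-window voxel groups rather than build a dense N-by-N mask.

```python
# Hypothetical sketch of sparse window attention over non-empty voxels.
# Not the paper's code: names, shapes, and the masking strategy are assumptions.
import torch
import torch.nn as nn


class SparseWindowAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, window_size: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.window_size = window_size
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, coords: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # coords: (N, 3) integer coordinates of the N non-empty voxels.
        # feats:  (N, C) features of those voxels; empty voxels never appear.
        # Assign each voxel to a non-overlapping local window.
        win = torch.div(coords, self.window_size, rounding_mode="floor")
        _, win_id = torch.unique(win, dim=0, return_inverse=True)

        q, k, v = self.qkv(feats).chunk(3, dim=-1)
        h, d = self.num_heads, feats.shape[-1] // self.num_heads
        q, k, v = (t.view(-1, h, d) for t in (q, k, v))

        # Attention logits for every voxel pair, masked so that only voxels
        # sharing a window attend to each other. (The dense mask is used
        # purely for clarity; it costs O(N^2) memory.)
        logits = torch.einsum("ihd,jhd->hij", q, k) * self.scale
        same_window = win_id[:, None] == win_id[None, :]
        logits = logits.masked_fill(~same_window, float("-inf"))
        attn = logits.softmax(dim=-1)

        out = torch.einsum("hij,jhd->ihd", attn, v).reshape(feats.shape)
        return self.proj(out)


if __name__ == "__main__":
    coords = torch.randint(0, 32, (128, 3))  # 128 non-empty voxels in a 32^3 grid
    feats = torch.randn(128, 64)
    print(SparseWindowAttention(dim=64)(coords, feats).shape)  # torch.Size([128, 64])
```

Because attention is computed only among occupied voxels inside each window, the cost scales with the number of non-empty voxels rather than with the full dense volume, which is what makes 3D attention tractable here.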