NFDI4DS | UHH-SEMS - Publication Details

Probing the 3D Awareness of Visual Foundation Models


DOI: 10.48550/arxiv.2404.08636 Publication Date: 2024-04-12


AUTHORS (10)

Mohamed El Banani

Amit Raj

Kevis-Kokitsi Man...

Abhishek Kar

Yuanzhen Li

Michael Rubinstein

Deqing Sun

Leonidas Guibas

Justin Johnson

Varun Jampani

ABSTRACT

Recent advances in large-scale pretraining have yielded visual foundation models with strong capabilities. Not only can recent models generalize to arbitrary images for their training task, their intermediate representations are useful for other visual tasks such as detection and segmentation. Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure. In this work, we analyze the 3D awareness of visual foundation models. We posit that 3D awareness implies that representations (1) encode the 3D structure of the scene and (2) consistently represent the surface across views. We conduct a series of experiments using task-specific probes and zero-shot inference procedures on frozen features. Our experiments reveal several limitations of the current models. Our code and analysis can be found at https://github.com/mbanani/probe3d.
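
The second criterion in the abstract (consistent surface representation across views) can be illustrated with a toy zero-shot check on frozen features: match each feature from one view to its nearest neighbor in another view and count how often the match lands on the same surface point. The sketch below is only an illustration under stated assumptions, not the authors' procedure (see the linked repository for that). The function names, the hand-written feature vectors, and the ground-truth indices are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def match_across_views(feats_a, feats_b):
    """For each feature in view A, return the index of the most similar feature in view B."""
    matches = []
    for fa in feats_a:
        sims = [cosine(fa, fb) for fb in feats_b]
        matches.append(max(range(len(sims)), key=sims.__getitem__))
    return matches


def correspondence_accuracy(matches, ground_truth):
    """Fraction of view-A features whose nearest neighbor is the true corresponding point."""
    correct = sum(1 for m, g in zip(matches, ground_truth) if m == g)
    return correct / len(ground_truth)


# Hypothetical frozen features for three surface points seen in two views.
# In practice these would come from a pretrained model's intermediate layer.
view_a = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [0.2, 0.1, 1.0]]
view_b = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]

# Assumed ground truth: point i in view A corresponds to ground_truth[i] in view B.
ground_truth = [1, 0, 2]

matches = match_across_views(view_a, view_b)
accuracy = correspondence_accuracy(matches, ground_truth)
print(matches, accuracy)
```

No training is involved here, which is what makes the check zero-shot: a low accuracy would suggest that the frozen features do not describe the same surface point consistently when the viewpoint changes.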
