Learning Robust 3D Representation from CLIP via Dual Denoising

DOI: 10.48550/arxiv.2407.00905 Publication Date: 2024-06-30
ABSTRACT
In this paper, we explore a critical yet under-investigated issue: how to learn robust and well-generalized 3D representations from pre-trained vision-language models such as CLIP. Previous works have demonstrated that cross-modal distillation can provide rich and useful knowledge for 3D data. However, like most deep learning models, the resultant network is still vulnerable to adversarial attacks, especially iterative attacks. In this work, we propose Dual Denoising, a novel framework for learning robust 3D representations. It combines a denoising-based proxy task with feature denoising for pre-training. Additionally, we utilize parallel noise inference to enhance the generalization of point cloud features under cross-domain settings. Experiments show that our model effectively improves performance and robustness under zero-shot settings without adversarial training. Our code is available at https://github.com/luoshuqing2001/Dual_Denoising.