SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Robustness
DOI:
10.48550/arXiv.2402.03317
Publication Date:
2024-01-02
AUTHORS (6)
ABSTRACT
Vision Transformers (ViTs) have gained prominence as a preferred choice for a wide range of computer vision tasks due to their exceptional performance. However, their widespread adoption has raised concerns about security in the face of malicious attacks. Most existing methods rely on empirical adjustments during the training process, lacking a clear theoretical foundation. In this study, we address this gap by introducing SpecFormer, specifically designed to enhance ViTs' resilience against adversarial attacks, with support from carefully derived guarantees. We establish local Lipschitz bounds for the self-attention layer and introduce a novel approach, Maximum Singular Value Penalization (MSVP), to attain precise control over these bounds. We seamlessly integrate MSVP into the attention layers, using the power iteration method for enhanced computational efficiency. The modified model, SpecFormer, effectively reduces the spectral norms of weight matrices, thereby enhancing network Lipschitzness. This, in turn, leads to improved efficiency and robustness. Extensive experiments on CIFAR and ImageNet datasets confirm SpecFormer's superior performance in defending against adversarial attacks.
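The abstract describes the core mechanism: estimate the largest singular value (spectral norm) of each attention projection matrix with power iteration, then add it to the training loss as a penalty so the network's Lipschitz constant stays controlled. The sketch below illustrates that idea in PyTorch under stated assumptions; it is not the authors' implementation, and the names power_iteration_sigma, msvp_penalty, and the weight lambda_msvp are hypothetical.

import torch
import torch.nn as nn

def power_iteration_sigma(W: torch.Tensor, n_iters: int = 5) -> torch.Tensor:
    """Estimate the largest singular value (spectral norm) of a 2-D
    weight matrix via power iteration, avoiding a full SVD."""
    v = torch.randn(W.shape[1], device=W.device)
    v = v / v.norm()
    u = W @ v
    for _ in range(n_iters):
        u = W @ v
        u = u / (u.norm() + 1e-12)
        v = W.t() @ u
        v = v / (v.norm() + 1e-12)
    # Once u, v approximate the top singular vectors, sigma_max ~ u^T W v.
    return torch.dot(u, W @ v)

def msvp_penalty(attn: nn.MultiheadAttention) -> torch.Tensor:
    """Hypothetical MSVP-style penalty: sum of estimated spectral norms
    of the query/key/value projection matrices of one attention layer."""
    # in_proj_weight stacks W_Q, W_K, W_V along the output dimension.
    w_q, w_k, w_v = attn.in_proj_weight.chunk(3, dim=0)
    return sum(power_iteration_sigma(w) for w in (w_q, w_k, w_v))

# Illustrative training step: regularize a single attention layer.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4)
x = torch.randn(10, 2, 64)    # (seq_len, batch, embed_dim)
out, _ = attn(x, x, x)
lambda_msvp = 1e-3            # assumed penalty weight, a tunable hyperparameter
loss = out.pow(2).mean() + lambda_msvp * msvp_penalty(attn)
loss.backward()

A few power-iteration steps per forward pass are far cheaper than a full SVD, which is why the abstract highlights the method's computational efficiency; in a full model the penalty would be summed over all attention layers.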