Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?
Robustness
Threat model
DOI:
10.48550/arxiv.2203.08392
Publication Date:
2022-01-01
AUTHORS (5)
ABSTRACT
Vision transformers (ViTs) have recently set off a new wave in neural architecture design thanks to their record-breaking performance in various vision tasks. In parallel, to fulfill the goal of deploying ViTs into real-world applications, their robustness against potential malicious attacks has gained increasing attention. In particular, recent works show that ViTs are more robust against adversarial attacks as compared with convolutional neural networks (CNNs), and conjecture that this is because ViTs focus on capturing global interactions among different input/feature patches, leading to improved robustness against local perturbations imposed by adversarial attacks. In this work, we ask an intriguing question: "Under what kinds of perturbations do ViTs become more vulnerable learners compared to CNNs?" Driven by this question, we first conduct a comprehensive experiment regarding the robustness of both ViTs and CNNs under existing adversarial attacks to understand the underlying reason favoring their robustness. Based on the drawn insights, we then propose a dedicated attack framework, dubbed Patch-Fool, that fools the self-attention mechanism by attacking its basic component (i.e., a single patch) with a series of attention-aware optimization techniques. Interestingly, our Patch-Fool framework shows for the first time that ViTs are not necessarily more robust than CNNs against adversarial perturbations. We find this to be consistent across extensive experiments, and the observations from Sparse/Mild Patch-Fool, two variants of Patch-Fool, indicate an intriguing insight that the perturbation density and strength on each patch seem to be the key factors that influence the robustness ranking between ViTs and CNNs.
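To make the attack idea above concrete, below is a minimal, hypothetical sketch of a single-patch adversarial attack in the spirit of Patch-Fool, assuming a standard PyTorch image classifier. It substitutes a gradient-based saliency proxy for the paper's attention-aware patch selection, and all function names, hyperparameters, and the selection heuristic are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical single-patch attack sketch (not the authors' code).
# Assumptions: `model` is a PyTorch classifier over images in [0, 1],
# patch selection uses input-gradient saliency instead of attention scores.
import torch
import torch.nn.functional as F

def single_patch_attack(model, x, y, patch_size=16, steps=250, step_size=8 / 255):
    """Perturb only the most salient patch of `x` to maximize the loss on label `y`."""
    x = x.clone().detach().requires_grad_(True)

    # 1) Score patches by the magnitude of the input gradient (saliency proxy
    #    standing in for the paper's attention-aware patch selection).
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    b, c, h, w = x.shape
    sal = grad.abs().sum(dim=1, keepdim=True)                  # (B, 1, H, W)
    sal = F.avg_pool2d(sal, patch_size).view(b, -1)            # (B, num_patches)
    patch_idx = sal.argmax(dim=1)                              # most salient patch per image

    # 2) Build a binary mask that exposes only the chosen patch.
    mask = torch.zeros(b, 1, h, w, device=x.device)
    n_cols = w // patch_size
    for i in range(b):
        r0 = int(patch_idx[i]) // n_cols * patch_size
        c0 = int(patch_idx[i]) % n_cols * patch_size
        mask[i, :, r0:r0 + patch_size, c0:c0 + patch_size] = 1.0

    # 3) Iterative gradient ascent confined to the selected patch
    #    (no imperceptibility constraint, since only one patch is modified).
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + step_size * grad.sign() * mask).clamp(0, 1).detach()
    return x_adv
```

In this sketch, restricting the update to a single patch mirrors the threat model studied in the paper, while the choice of saliency proxy, step size, and iteration count are placeholders that would need to be replaced by the attention-aware techniques described in the full text.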