Privacy-preserving Universal Adversarial Defense for Black-box Models

DOI: 10.48550/arxiv.2408.10647 Publication Date: 2024-08-20
ABSTRACT
Deep neural networks (DNNs) are increasingly used in critical applications such as identity authentication and autonomous driving, where robustness against adversarial attacks is crucial. These attacks can exploit minor perturbations to cause significant prediction errors, making it essential to enhance the resilience of DNNs. Traditional defense methods often rely on access to detailed model information, which raises privacy concerns, as model owners may be reluctant to share such data. In contrast, existing black-box defense methods fail to offer a universal defense against various types of attacks. To address these challenges, we introduce DUCD, a universal black-box defense method that does not require the target model's parameters or architecture. Our approach involves distilling the target model by querying it with data, creating a white-box surrogate while preserving data privacy. We further harden this surrogate using a certified defense based on randomized smoothing with optimized noise selection, enabling robust defense against a broad range of attacks. Comparative evaluations between the certified defenses of the surrogate and target models demonstrate the effectiveness of our approach. Experiments on multiple image classification datasets show that DUCD not only outperforms existing black-box defenses but also matches the accuracy of white-box defenses, all while enhancing data privacy and reducing the success rate of membership inference attacks.
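The query-based distillation step can be pictured with a short sketch. Assuming the target model is reachable only through a query interface that returns class probabilities, a surrogate can be trained to match those soft outputs with a KL-divergence loss; `query_target`, `surrogate`, and `distill_step` below are illustrative placeholder names, not identifiers from the paper:

```python
# Minimal sketch of black-box distillation via queries (hypothetical names).
# The target is never opened: only its output probabilities are observed.
import torch
import torch.nn.functional as F

def distill_step(surrogate, optimizer, query_target, batch):
    """One training step: fit the surrogate to the target's soft labels."""
    with torch.no_grad():
        teacher_probs = query_target(batch)   # black-box query -> class probabilities
    student_logits = surrogate(batch)
    # KL divergence between surrogate log-probabilities and teacher probabilities
    loss = F.kl_div(F.log_softmax(student_logits, dim=1),
                    teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only query outputs cross the boundary, the target's parameters and training data stay private; the resulting white-box surrogate is what the certified defense is then applied to.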
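The certified-defense component builds on randomized smoothing, in which the prediction is a majority vote of the base classifier under Gaussian noise and a certified L2 radius follows from a lower confidence bound on the top-class probability (the standard Cohen et al., 2019 procedure). Below is a minimal sketch of that generic procedure only, with placeholder names and without the paper's optimized noise selection:

```python
# Sketch of randomized-smoothing prediction and certification (generic method,
# not the paper's exact implementation; `base_classifier` and `sigma` are
# placeholders). Certified radius: R = sigma * Phi^{-1}(p_lower).
import numpy as np
import torch
from scipy.stats import beta, norm

def smoothed_predict_and_certify(base_classifier, x, sigma, n=1000,
                                 alpha=0.001, num_classes=10, batch=100):
    """Majority vote under N(0, sigma^2 I) noise plus an L2 certified radius."""
    counts = np.zeros(num_classes, dtype=int)
    remaining = n
    with torch.no_grad():
        while remaining > 0:
            b = min(batch, remaining)
            noise = torch.randn(b, *x.shape) * sigma          # Gaussian noise
            preds = base_classifier(x.unsqueeze(0) + noise).argmax(dim=1)
            counts += np.bincount(preds.cpu().numpy(), minlength=num_classes)
            remaining -= b
    top = int(counts.argmax())
    k = int(counts[top])
    # Clopper-Pearson lower confidence bound on the top-class probability.
    p_lower = beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0
    if p_lower <= 0.5:
        return None, 0.0                                      # abstain: no certificate
    return top, sigma * norm.ppf(p_lower)                     # certified L2 radius
```

Any input within the returned radius of `x` (in L2 norm) is guaranteed to receive the same smoothed prediction, which is what makes the defense attack-agnostic rather than tailored to one perturbation type.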