Detecting Adversarial Attacks via Subset Scanning of Autoencoder Activations and Reconstruction Error
DOI: 10.24963/ijcai.2020/122
Publication Date: 2020-07-08
AUTHORS (7)
ABSTRACT
Reliably detecting attacks in a given set of inputs is of high practical relevance because of the vulnerability of neural networks to adversarial examples. These altered inputs create a security risk in applications with real-world consequences, such as self-driving cars, robotics and financial services. We propose an unsupervised method for detecting adversarial attacks in the inner layers of autoencoder (AE) networks by maximizing a non-parametric measure of anomalous node activations. Previous work in this space has shown that AE networks can detect anomalous images by thresholding the reconstruction error produced by the final layer. Furthermore, other detection methods rely on data augmentation or specialized training techniques which must be put in place ahead of time. In contrast, we use subset scanning methods from the anomalous pattern detection domain to enhance detection power without labeled examples of the noise or retraining. In addition to an anomalousness "score", our proposed method also returns the subset of nodes within the network that contributed to that score. This will allow future work to pivot towards visualisation and explainability. Our approach shows consistently higher detection power than existing detection methods across several adversarial noise models and a wide range of perturbation strengths.
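The abstract describes scoring subsets of anomalous node activations with a non-parametric scan statistic. As a rough illustration only, not a reproduction of the authors' implementation, the Python sketch below scores one input's activations against a clean background using empirical p-values and a Berk-Jones-style scan statistic; the function names, the add-one smoothing, and the alpha_max cutoff are illustrative assumptions rather than details taken from the paper.

import numpy as np


def empirical_pvalues(clean_acts, test_acts):
    """One-sided empirical p-value per node: the share of clean
    background activations at that node that are at least as large
    as the test input's activation (with add-one smoothing)."""
    n_clean = clean_acts.shape[0]
    exceed = (clean_acts >= test_acts).sum(axis=0)
    return (exceed + 1.0) / (n_clean + 1.0)


def bj_scan(pvalues, alpha_max=0.5):
    """Maximize a Berk-Jones scan statistic over subsets of nodes.

    With one p-value per node, the highest-scoring subset for a fixed
    significance level alpha is exactly the set of nodes with
    p <= alpha, so the search over all 2^n subsets reduces to checking
    each sorted p-value as a candidate alpha. Returns the maximum
    score and the indices of the winning subset."""
    order = np.argsort(pvalues)
    p_sorted = pvalues[order]
    best_score, best_k = 0.0, 0
    for k, alpha in enumerate(p_sorted, start=1):
        if alpha > alpha_max:
            break
        # Berk-Jones score for the k most significant nodes; with all
        # k p-values <= alpha it simplifies to k * log(1 / alpha).
        score = k * np.log(1.0 / alpha)
        if score > best_score:
            best_score, best_k = score, k
    return best_score, order[:best_k]


# Toy demonstration on synthetic "activations": an adversarial input
# is simulated by shifting a small subset of nodes off the background.
rng = np.random.default_rng(0)
clean = rng.normal(size=(500, 128))   # activations of clean inputs
test = rng.normal(size=128)           # a new in-distribution input
adv = test.copy()
adv[:10] += 3.0                       # perturb 10 of 128 nodes

for name, acts in [("clean", test), ("adversarial", adv)]:
    score, subset = bj_scan(empirical_pvalues(clean, acts))
    print(f"{name}: score={score:.2f}, anomalous nodes={len(subset)}")

Because the optimal subset for each threshold consists of the nodes with the smallest p-values, the returned indices double as the subset of contributing nodes that the abstract says the method reports alongside its score.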