HotFlip: White-Box Adversarial Examples for Text Classification
White box
DOI:
10.18653/v1/p18-2006
Publication Date:
2019-06-29T15:52:10Z
AUTHORS (4)
ABSTRACT
We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier. We find that only a few manipulations are needed to greatly decrease the accuracy. Our method relies on an atomic flip operation, which swaps one token for another, based on the gradients of the one-hot input vectors. Due to the efficiency of our method, we can perform adversarial training, which makes the model more robust to attacks at test time. With the use of a few semantics-preserving constraints, we demonstrate that HotFlip can be adapted to attack a word-level classifier as well.
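The flip operation described in the abstract can be illustrated with a minimal sketch. It assumes a toy linear softmax classifier over one-hot character inputs, not the neural classifier used in the paper. All names here (`W`, `one_hot_gradient`, `best_flip`, the four-letter vocabulary) are hypothetical and chosen for illustration only. The sketch scores each candidate swap with a first-order estimate: the gradient entry for the new character minus the gradient entry for the current one.

```python
import math
import random


def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def logits(W, text_ids):
    # Toy model (an assumption): each (position, character) pair adds a weight vector.
    n_classes = len(W[0][0])
    return [sum(W[i][c][k] for i, c in enumerate(text_ids)) for k in range(n_classes)]


def loss(W, text_ids, label):
    # Cross-entropy loss of the true label.
    return -math.log(softmax(logits(W, text_ids))[label])


def one_hot_gradient(W, text_ids, label):
    # Gradient of the loss with respect to every entry of the one-hot input matrix.
    p = softmax(logits(W, text_ids))
    err = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [
        [sum(e * w for e, w in zip(err, W[i][c])) for c in range(len(W[i]))]
        for i in range(len(text_ids))
    ]


def best_flip(W, text_ids, label):
    # First-order estimate of the loss change when position i flips from a to b:
    # grad[i][b] - grad[i][a]. Return the flip with the largest estimate.
    grad = one_hot_gradient(W, text_ids, label)
    best = None
    for i, a in enumerate(text_ids):
        for b in range(len(grad[i])):
            if b == a:
                continue
            gain = grad[i][b] - grad[i][a]
            if best is None or gain > best[0]:
                best = (gain, i, b)
    return best


random.seed(0)
VOCAB = "abcd"   # hypothetical character vocabulary
N_CLASSES = 2
text = "abca"
label = 0
text_ids = [VOCAB.index(ch) for ch in text]
# Random weights stand in for a trained model (illustration only).
W = [[[random.gauss(0, 1) for _ in range(N_CLASSES)] for _ in VOCAB] for _ in text_ids]

gain, pos, new_id = best_flip(W, text_ids, label)
flipped = list(text_ids)
flipped[pos] = new_id
print("original:", text, "loss:", loss(W, text_ids, label))
print("flipped: ", "".join(VOCAB[c] for c in flipped), "loss:", loss(W, flipped, label))
print("estimated gain:", gain)
```

A single gradient computation ranks every possible character swap at once, which is what makes this kind of attack cheap enough to use inside a training loop. For this toy model the loss is convex in the one-hot input, so the estimate is a lower bound on the true loss change. For a deep network it is only an approximation.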
CITATIONS (324)