BERT-ATTACK: Adversarial Attack Against BERT Using BERT

DOI: 10.48550/arxiv.2004.09984 Publication Date: April 2020
ABSTRACT
Adversarial attacks for discrete data (such as texts) have been proved significantly more challenging than for continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods. Current successful attack methods for texts usually adopt heuristic replacement strategies on the character or word level, and it remains challenging to find the optimal solution in the massive space of possible combinations of replacements while preserving semantic consistency and language fluency. In this paper, we propose BERT-Attack, a high-quality and effective method to generate adversarial samples using pre-trained masked language models exemplified by BERT. We turn BERT against its fine-tuned models and other deep neural models in downstream tasks, so that we can successfully mislead the target models to predict incorrectly. Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage, while the generated adversarial samples are fluent and semantically preserved. Also, the cost of calculation is low, making large-scale generations possible. The code is available at https://github.com/LinyangLee/BERT-Attack.
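To make the core idea concrete, below is a minimal sketch of turning a masked language model against a fine-tuned classifier: mask a word, let BERT propose fluent replacements, and keep a candidate that flips the victim's prediction. This is not the authors' implementation (the paper additionally ranks words by importance and handles sub-word pieces; see the linked repo); the model names, the greedy left-to-right loop, and the single-word substitution are illustrative assumptions.

```python
# Sketch of BERT-Attack's replacement step: a masked LM proposes candidate
# words, and we greedily keep one that changes the victim's prediction.
# Model names below are illustrative examples, not the paper's setup.
import torch
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          AutoModelForSequenceClassification)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()
# Hypothetical victim: any fine-tuned sequence classifier works here.
victim = AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-SST-2").eval()

def predict(text):
    """Victim model's predicted label for a piece of text."""
    with torch.no_grad():
        logits = victim(**tok(text, return_tensors="pt")).logits
    return logits.argmax(-1).item()

def attack(words, k=8):
    """Greedy single-word substitution guided by the masked LM."""
    orig_label = predict(" ".join(words))
    for i in range(len(words)):
        # Mask position i and ask BERT for fluent replacements.
        masked = words[:i] + [tok.mask_token] + words[i + 1:]
        enc = tok(" ".join(masked), return_tensors="pt")
        mask_pos = (enc.input_ids[0] == tok.mask_token_id).nonzero()[0].item()
        with torch.no_grad():
            logits = mlm(**enc).logits[0, mask_pos]
        for cand_id in logits.topk(k).indices.tolist():
            cand = tok.convert_ids_to_tokens(cand_id)
            if cand.startswith("##") or cand == words[i]:
                continue  # skip sub-word pieces and the original word
            trial = words[:i] + [cand] + words[i + 1:]
            if predict(" ".join(trial)) != orig_label:
                return trial  # adversarial sample found
    return None  # no single-word substitution flipped the label

print(attack("the movie was absolutely wonderful".split()))
```

Because candidates come from a masked LM rather than a static synonym list, they tend to fit the context, which is what keeps the adversarial samples fluent and the perturbation percentage low.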