Data sanitization against adversarial label contamination based on data complexity

Subject areas: 0202 Electrical Engineering, Electronic Engineering, Information Engineering; 02 Engineering and Technology
DOI: 10.1007/s13042-016-0629-5 Publication Date: 2017-01-24T19:05:00Z
ABSTRACT
Machine learning techniques may suffer from adversarial attacks, in which an attacker misleads a learning process by manipulating training samples. Data sanitization is one countermeasure against poisoning attacks: a data pre-processing step that filters suspect samples before learning. A number of data sanitization methods have recently been devised for label flip attacks, but their flexibility is limited by specific assumptions. We observe that the abrupt label flips caused by an attack change the complexity of classification. This paper proposes a data sanitization method based on data complexity, a measure of the difficulty of classification on a dataset. Our method measures the data complexity of a training set after removing a sample and its nearest samples; contaminated samples are then distinguished from untainted samples according to their data complexity values. Experimental results support the idea that data complexity can be used to identify attack samples, and the proposed method achieves better detection accuracy than the current sanitization method on well-known security application problems.
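The detection procedure described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify the complexity measure, the number of neighbours removed, or the decision rule, so here the leave-one-out 1-NN error rate stands in as the data complexity measure, `k = 2` neighbours are removed alongside each candidate, and the sample whose removal causes the largest complexity drop is flagged.

```python
# Minimal sketch of complexity-based label-flip detection.
# ASSUMPTIONS (not given by the abstract): complexity = leave-one-out
# 1-NN error rate; k = 2 neighbours removed with each candidate sample;
# "most suspicious" = largest complexity drop after removal.
import numpy as np

def one_nn_error(X, y):
    """Data-complexity proxy: leave-one-out 1-NN error rate.
    Higher values mean the classification task is harder."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)      # a point is not its own neighbour
    return float(np.mean(y[d.argmin(axis=1)] != y))

def complexity_scores(X, y, k=2):
    """Score each sample: remove it and its k nearest neighbours, then
    measure how much the complexity of the remaining set drops.
    A large drop suggests the removed neighbourhood was contaminated."""
    base = one_nn_error(X, y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    scores = np.empty(len(X))
    for i in range(len(X)):
        drop = np.append(d[i].argsort()[:k], i)
        keep = np.setdiff1d(np.arange(len(X)), drop)
        scores[i] = base - one_nn_error(X[keep], y[keep])
    return scores                    # larger score => more suspicious

# Toy demo: two well-separated clusters with one adversarial label flip.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (10, 2)),
               rng.normal(3.0, 0.3, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
y[5] = 1                             # the label-flip attack sample
scores = complexity_scores(X, y)
suspect = int(scores.argmax())       # flipped sample attains the top score
                                     # (possibly tied with its neighbours)
```

The intuition matches the abstract: deleting an untainted sample barely changes the 1-NN error, while deleting the flipped sample (and its neighbourhood) removes the local label inconsistency, so complexity drops sharply and the sample stands out.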