# Provable Defense Against Delusive Poisoning

Feb 09, 2021
Lue Tao, Lei Feng, Jinfeng Yi, Sheng-Jun Huang, Songcan Chen

Delusive poisoning is an attack that obstructs learning: by manipulating, even slightly, only the features of correctly labeled training examples, the attacker can significantly degrade the learner's performance. We formalize this attack as finding the worst-case distribution shift at training time within a specific $\infty$-Wasserstein ball, and show that minimizing adversarial risk on the poisoned data is equivalent to optimizing an upper bound on the natural risk on the original data. This implies that adversarial training is a principled defense against delusive poisoning. To further understand the mechanism of the defense, we show that adversarial training resists the training distribution shift by preventing the learner from relying too heavily on non-robust features in a natural setting. Finally, we complement our theoretical findings with experiments on popular benchmark datasets, which show that the defense withstands six different practical attacks. Both the theoretical and the empirical results support adversarial training as a defense against delusive poisoning.
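To make the idea of minimizing adversarial risk concrete, here is a minimal sketch of adversarial training for a toy linear classifier with logistic loss under an $\ell_\infty$ perturbation budget. It is not the authors' implementation: the linear model, the closed-form inner maximization, and the names `adversarial_logistic_loss`, `adversarial_train`, `eps`, `lr`, and `steps` are all assumptions chosen for illustration. For a linear model, the worst-case perturbation of radius `eps` is `x - eps * y * sign(w)`, which lowers every margin by `eps * ||w||_1`.

```python
import numpy as np


def adversarial_logistic_loss(w, b, X, y, eps):
    """Mean worst-case logistic loss under an l_inf perturbation of radius eps.

    Labels y are in {-1, +1}. For a linear model the inner maximization has a
    closed form: each margin is reduced by eps * ||w||_1.
    """
    margins = y * (X @ w + b) - eps * np.sum(np.abs(w))
    return np.mean(np.logaddexp(0.0, -margins))


def adversarial_train(X, y, eps, lr=0.1, steps=500):
    """Gradient descent on the adversarial logistic loss (illustrative only)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        # Inner maximization: worst-case inputs for the current weights.
        X_adv = X - eps * y[:, None] * np.sign(w)
        margins = y * (X_adv @ w + b)
        # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)); chain rule through m.
        coef = -y / (1.0 + np.exp(margins))
        # Outer minimization: gradient step with the perturbed inputs held fixed.
        w -= lr * (X_adv.T @ coef) / n
        b -= lr * np.mean(coef)
    return w, b


# Toy data (hypothetical): two Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(0)
y = rng.choice([-1.0, 1.0], size=200)
X = rng.normal(size=(200, 2)) + 2.0 * y[:, None]

w, b = adversarial_train(X, y, eps=0.5)
adv_loss = adversarial_logistic_loss(w, b, X, y, eps=0.5)
nat_loss = adversarial_logistic_loss(w, b, X, y, eps=0.0)
```

In this sketch the adversarial loss is never smaller than the natural loss of the same model, because setting `eps=0.0` recovers the ordinary logistic loss. That is a pointwise version of the upper-bound relationship the abstract describes. The paper's guarantee concerns distributions within an $\infty$-Wasserstein ball, which this toy example does not attempt to reproduce.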
