
PatchGuard: Provable Defense against Adversarial Patches Using Masks on Small Receptive Fields

May 17, 2020
Chong Xiang, Arjun Nitin Bhagoji, Vikash Sehwag, Prateek Mittal



Localized adversarial patches aim to induce misclassification in machine learning models by arbitrarily modifying pixels within a restricted region of an image. Such attacks can be realized in the physical world by attaching the adversarial patch to the object to be misclassified. In this paper, we propose a general defense framework that can achieve both high clean accuracy and provable robustness against localized adversarial patches. The cornerstone of our defense framework is to use a convolutional network with small receptive fields that impose a bound on the number of features corrupted by an adversarial patch. We further present the robust masking defense that robustly detects and masks corrupted features for a secure feature aggregation. We evaluate our defense against the most powerful white-box untargeted adaptive attacker and achieve a 92.3% clean accuracy and an 85.2% provable robust accuracy on a 10-class subset of ImageNet against a 31x31 adversarial patch (2% pixels), a 57.4% clean accuracy and a 14.4% provable robust accuracy on 1000-class ImageNet against a 31x31 patch (2% pixels), and an 80.3% clean accuracy and a 61.3% provable robust accuracy on CIFAR-10 against a 5x5 patch (2.4% pixels). Notably, our provable defenses achieve state-of-the-art provable robust accuracy on ImageNet and CIFAR-10.
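The bound that small receptive fields place on corrupted features can be illustrated with a little arithmetic. The sketch below is not taken from the paper's code. It assumes a square patch of side p, a feature extractor whose features each see a square receptive field of side r, and a stride s between neighbouring features. Under those assumptions, a simple overlap count gives at most ceil((p + r - 1) / s) affected features along each axis. The function names, the example values r = 17 and s = 8, and the 224x224 ImageNet input size are assumptions chosen for illustration; the 32x32 CIFAR-10 image size is standard.

```python
import math


def corrupted_features_per_axis(patch: int, receptive_field: int, stride: int) -> int:
    """Upper bound on features along one axis whose receptive field can overlap the patch.

    A patch of side `patch` can touch any receptive field that starts within
    (patch + receptive_field - 1) consecutive pixel positions; features are
    spaced `stride` pixels apart, hence the ceiling division.
    """
    return math.ceil((patch + receptive_field - 1) / stride)


def patch_pixel_fraction(patch: int, image: int) -> float:
    """Fraction of a square image's pixels covered by a square patch."""
    return (patch * patch) / (image * image)


# Hypothetical small-receptive-field network: r = 17, s = 8 (assumed values).
window = corrupted_features_per_axis(patch=31, receptive_field=17, stride=8)

# Patch sizes quoted in the abstract, with assumed input resolutions.
imagenet_fraction = patch_pixel_fraction(31, 224)  # assumes 224x224 inputs
cifar_fraction = patch_pixel_fraction(5, 32)       # CIFAR-10 images are 32x32
```

With these assumed values, a 31x31 patch can affect at most a 6x6 window of features, which is the region a masking step would need to cover. The pixel fractions come out at roughly 1.9% and 2.4%, consistent with the "2% pixels" and "2.4% pixels" figures in the abstract.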


