David Wagner

Minimum-Norm Adversarial Examples on KNN and KNN-Based Models

Mar 14, 2020

Chawin Sitawarin, David Wagner

Abstract: We study the robustness against adversarial examples of kNN classifiers and classifiers that combine kNN with neural networks. The main difficulty lies in the fact that finding an optimal attack on kNN is intractable for typical datasets. In this work, we propose a gradient-based attack on kNN and kNN-based defenses, inspired by the previous work by Sitawarin & Wagner [1]. We demonstrate that our attack outperforms their method on all of the models we tested with only a minimal increase in the computation time. The attack also beats the state-of-the-art attack [2] on kNN when k > 1 using less than 1% of its running time. We hope that this attack can be used as a new baseline for evaluating the robustness of kNN and its variants.

* 3rd Deep Learning and Security Workshop (co-located with the 41st IEEE Symposium on Security and Privacy)

Stateful Detection of Black-Box Adversarial Attacks

Jul 12, 2019

Steven Chen, Nicholas Carlini, David Wagner

Abstract: The problem of adversarial examples, evasion attacks on machine learning classifiers, has proven extremely difficult to solve. This is true even when, as is the case in many practical settings, the classifier is hosted as a remote service and so the adversary does not have direct access to the model parameters. This paper argues that in such settings, defenders have a much larger space of actions than has been previously explored. Specifically, we deviate from the implicit assumption made by prior work that a defense must be a stateless function that operates on individual examples, and explore the possibility of stateful defenses. To begin, we develop a defense designed to detect the process of adversarial example generation. By keeping a history of past queries, a defender can try to identify when a sequence of queries appears to be for the purpose of generating an adversarial example. We then introduce query blinding, a new class of attacks designed to bypass defenses that rely on such an approach. We believe that expanding the study of adversarial examples from stateless classifiers to stateful systems is not only more realistic for many black-box settings, but also gives the defender a much-needed advantage in responding to the adversary.
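
The query-history idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `StatefulDetector`, the Euclidean distance, the mean k-nearest-neighbor rule, and the values of `k` and `threshold` are all assumptions chosen for illustration. The sketch keeps a bounded history of queries and flags a new query when it sits unusually close to several earlier ones, as a run of small perturbations around one input would.

```python
import math
from collections import deque


class StatefulDetector:
    """Hypothetical sketch: flag queries that cluster tightly around past queries."""

    def __init__(self, k=2, threshold=0.1, history_size=1000):
        self.k = k                  # number of nearest past queries to average over
        self.threshold = threshold  # assumed cutoff on the mean k-NN distance
        self.history = deque(maxlen=history_size)  # bounded query history

    def observe(self, query):
        """Record the query; return True if it looks like part of an attack."""
        distances = sorted(math.dist(query, past) for past in self.history)
        self.history.append(tuple(query))
        if len(distances) < self.k:
            return False            # not enough history to decide yet
        mean_knn = sum(distances[: self.k]) / self.k
        return mean_knn < self.threshold


detector = StatefulDetector(k=2, threshold=0.1)

# Unrelated queries, far apart from one another.
benign = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
flags_benign = [detector.observe(q) for q in benign]

# Attack-like queries: tiny perturbations around a single point.
probes = [(3.0, 3.0), (3.01, 3.0), (3.0, 3.01)]
flags_probe = [detector.observe(q) for q in probes]
```

In this toy run only the third probe is flagged, because it is the first query with two very close predecessors. A real deployment would likely compare queries in a learned feature space rather than raw input space, and would have to account for attacks such as the query blinding described above.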

Defending Against Adversarial Examples with K-Nearest Neighbor

Jun 23, 2019

Chawin Sitawarin, David Wagner

Abstract: Robustness is an increasingly important property of machine learning models as they become more and more prevalent. We propose a defense against adversarial examples based on a k-nearest neighbor (kNN) classifier applied to the intermediate activations of neural networks. Our scheme surpasses state-of-the-art defenses on MNIST and CIFAR-10 against l2-perturbation by a significant margin. The mean perturbation norm required to fool our models is 3.07 on MNIST and 2.30 on CIFAR-10. Additionally, we propose a simple certifiable lower bound on the l2-norm of the adversarial perturbation using a more specific version of our scheme, a 1-NN on representations learned by a Lipschitz network. Our model provides a nontrivial average lower bound of the perturbation norm, comparable to other schemes on MNIST with similar clean accuracy.

* Preprint
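
The certifiable lower bound mentioned in the abstract above can be illustrated with a standard margin argument. This is a sketch under stated assumptions, not necessarily the exact bound derived in the paper. Assume the network f is L-Lipschitz in the l2 norm, let d_same be the distance in representation space from f(x) to its nearest training point (which determines the 1-NN prediction), and let d_other be the distance to the nearest training point of any other class. Moving f(x) by less than (d_other - d_same) / 2 cannot change the nearest class, and an input perturbation delta moves f(x) by at most L * ||delta||, so any successful perturbation must satisfy ||delta|| >= (d_other - d_same) / (2L). The function name and toy data below are hypothetical.

```python
import math


def certified_l2_lower_bound(z, train_reps, train_labels, lipschitz):
    """Return (predicted label, lower bound on the l2 input perturbation).

    z            : representation f(x) of the test input
    train_reps   : representations of the training points
    train_labels : their class labels (at least two classes must be present)
    lipschitz    : assumed l2 Lipschitz constant L of the network f
    """
    nearest = min(range(len(train_reps)), key=lambda i: math.dist(z, train_reps[i]))
    predicted = train_labels[nearest]
    d_same = math.dist(z, train_reps[nearest])
    d_other = min(
        math.dist(z, rep)
        for rep, label in zip(train_reps, train_labels)
        if label != predicted
    )
    # Margin in representation space, scaled back to input space by 1 / L.
    return predicted, (d_other - d_same) / (2.0 * lipschitz)


# Toy example: two training points in a 2-D representation space.
reps = [(0.0, 0.0), (4.0, 0.0)]
labels = [0, 1]
label, bound = certified_l2_lower_bound((1.0, 0.0), reps, labels, lipschitz=2.0)
```

Here d_same = 1 and d_other = 3, so with L = 2 the bound is (3 - 1) / (2 * 2) = 0.5. The bound is only as tight as the Lipschitz constant supplied, which is why the abstract pairs it with a network trained to be Lipschitz.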

On the Robustness of Deep K-Nearest Neighbors

Mar 20, 2019

Chawin Sitawarin, David Wagner

Abstract: Despite a large amount of attention on adversarial examples, very few works have demonstrated an effective defense against this threat. We examine Deep k-Nearest Neighbor (DkNN), a proposed defense that combines k-Nearest Neighbor (kNN) and deep learning to improve the model's robustness to adversarial examples. It is challenging to evaluate the robustness of this scheme due to the lack of an efficient algorithm for attacking kNN classifiers with large k and high-dimensional data. We propose a heuristic attack that allows us to use gradient descent to find adversarial examples for kNN classifiers, and then apply it to attack the DkNN defense as well. Results suggest that our attack is moderately stronger than any naive attack on kNN and significantly outperforms other attacks on DkNN.

* Published at Deep Learning and Security Workshop 2019 (IEEE S&P)

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Jul 31, 2018

Anish Athalye, Nicholas Carlini, David Wagner

Abstract: We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.

* ICML 2018. Source code at https://github.com/anishathalye/obfuscated-gradients
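
To make the idea of circumventing an obfuscated gradient concrete, the following toy sketch shows one way a non-differentiable preprocessing step can be bypassed: run the real defense on the forward pass, but approximate it by the identity on the backward pass. Everything here is an assumption for illustration, including the quantization defense, the linear scoring function, and the step size; the referenced repository contains the authors' actual attacks, which may differ in detail.

```python
def quantize(x, levels=4):
    """Assumed defense: snap each coordinate to a grid (gradient is zero almost everywhere)."""
    return [round(v * levels) / levels for v in x]


def score(y, w, b):
    """Toy linear classifier: a non-negative score means the original class."""
    return sum(wi * yi for wi, yi in zip(w, y)) + b


def approximate_backward_attack(x, w, b, step=0.05, max_steps=100):
    """Signed-gradient descent that treats quantize() as the identity when differentiating."""
    x_adv = list(x)
    for _ in range(max_steps):
        # Forward pass goes through the real defense.
        if score(quantize(x_adv), w, b) < 0:
            return x_adv
        # Backward pass: with quantize approximated by the identity,
        # the gradient of the score with respect to the input is just w.
        grad = w
        x_adv = [
            v - step * (1 if g > 0 else -1 if g < 0 else 0)
            for v, g in zip(x_adv, grad)
        ]
    return None  # no adversarial example found within the step budget


w, b = [1.0, -1.0], 0.0
x = [0.6, 0.2]
x_adv = approximate_backward_attack(x, w, b)
```

A naive attacker who differentiates through `quantize` sees a zero gradient and makes no progress, which is the false sense of security the abstract describes. The approximated gradient still points in a useful direction, so the attack crosses the decision boundary after a few small steps.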

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Mar 30, 2018

Nicholas Carlini, David Wagner

Abstract: We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.

MagNet and "Efficient Defenses Against Adversarial Attacks" are Not Robust to Adversarial Examples

Nov 22, 2017

Nicholas Carlini, David Wagner

Abstract: MagNet and "Efficient Defenses..." were recently proposed as defenses against adversarial examples. We find that we can construct adversarial examples that defeat these defenses with only a slight increase in distortion.

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

Nov 01, 2017

Nicholas Carlini, David Wagner

Abstract: Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.

Towards Evaluating the Robustness of Neural Networks

Mar 22, 2017

Nicholas Carlini, David Wagner

Abstract: Neural networks provide state-of-the-art results for most machine learning tasks. Unfortunately, neural networks are vulnerable to adversarial examples: given an input $x$ and any target classification $t$, it is possible to find a new input $x'$ that is similar to $x$ but classified as $t$. This makes it difficult to apply neural networks in security-critical areas. Defensive distillation is a recently proposed approach that can take an arbitrary neural network and increase its robustness, reducing the success rate of current attacks in finding adversarial examples from $95\%$ to $0.5\%$. In this paper, we demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with $100\%$ probability. Our attacks are tailored to three distance metrics used previously in the literature, and when compared to previous adversarial example generation algorithms, our attacks are often much more effective (and never worse). Furthermore, we propose using high-confidence adversarial examples in a simple transferability test that we show can also be used to break defensive distillation. We hope our attacks will be used as a benchmark in future defense attempts to create neural networks that resist adversarial examples.

Spoofing 2D Face Detection: Machines See People Who Aren't There

Aug 06, 2016

Michael McCoyd, David Wagner

Abstract: Machine learning is increasingly used to make sense of the physical world, yet may suffer from adversarial manipulation. We examine the Viola-Jones 2D face detection algorithm to study whether images can be created that humans do not notice as faces yet the algorithm detects as faces. We show that it is possible to construct images that Viola-Jones recognizes as containing faces yet no human would consider to be a face. Moreover, we show that it is possible to construct images that fool facial detection even when they are printed and then photographed.

* 9 pages, 19 figures, submitted to AISec
