
Title: A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models

Jun 26, 2020

Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao Lin, Ting Liu


Abstract: Deep neural networks (DNNs) are well known to be vulnerable to adversarial attacks and backdoor attacks, in which minor modifications to the input mislead the model into giving wrong results. Although defenses against adversarial attacks have been widely studied, research on mitigating backdoor attacks is still at an early stage, and it is unknown whether the defenses against these two attacks share any connections or common characteristics. In this paper, we present a unified framework for detecting malicious examples and protecting the inference results of deep learning models. The framework is based on our observation that both adversarial examples and backdoor examples exhibit anomalies during the inference process that make them highly distinguishable from benign samples. Building on this observation, we repurpose and revise four existing adversarial defense methods to detect backdoor examples. Extensive evaluations indicate that these approaches provide reliable protection against backdoor attacks, with higher accuracy than when detecting adversarial examples. These solutions also reveal the relationships among adversarial examples, backdoor examples, and normal samples in model sensitivity, activation space, and feature space. This can enhance our understanding of the inherent features of these two attacks, as well as of the defense opportunities.
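To make the idea of an inference-time anomaly concrete, here is a minimal sketch of one generic way to measure model sensitivity: compare the model's prediction on an input with its prediction on a lightly transformed copy, and flag inputs whose predictions shift too much. This is an illustration only, not the paper's method. The abstract does not name the four repurposed defenses, so the bit-depth transform, the L1 score, the threshold value, and every name below (`reduce_bit_depth`, `sensitivity_score`, `flag_suspicious`, `toy_predict`) are assumptions chosen for the example.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def reduce_bit_depth(x, bits=3):
    """Quantize inputs in [0, 1] onto a grid of 2**bits values (assumed transform)."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels


def sensitivity_score(predict, x, transform=reduce_bit_depth):
    """Per-sample L1 distance between predictions on x and on transform(x)."""
    return np.abs(predict(x) - predict(transform(x))).sum(axis=-1)


def flag_suspicious(predict, x, threshold, transform=reduce_bit_depth):
    """Mark samples whose prediction shift exceeds the threshold (hypothetical rule)."""
    return sensitivity_score(predict, x, transform) > threshold


# Toy stand-in for a trained classifier: a random linear layer plus softmax.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))


def toy_predict(x):
    return softmax(x @ W)


x = rng.uniform(size=(5, 8))  # five inputs with features in [0, 1]
scores = sensitivity_score(toy_predict, x)
flags = flag_suspicious(toy_predict, x, threshold=0.5)  # threshold is arbitrary here
```

In practice the threshold would be calibrated on held-out benign samples, for example to hit a target false-positive rate. Whether malicious inputs actually score higher than benign ones depends on the model and the attack, which is the kind of question the paper's evaluation addresses.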
