"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks

Apr 10, 2022
Edoardo Mosca, Shreyash Agarwal, Javier Rando-Ramirez, Georg Groh

Figures 1-4 for "That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks
