Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

Adversarial Examples Target Topological Holes in Deep Networks

Jan 28, 2019
Thomas Gebhart, Paul Schrater

It is currently unclear why adversarial examples are easy to construct for deep networks that are otherwise successful with respect to their training domain. However, it is suspected that these adversarial examples lie within some small perturbation from the network's decision boundaries or exist in low-density regions with respect to the training distribution. Using persistent homology, we find that deep networks effectively have ``holes'' in their activation graphs, making them blind to regions of the input space that can be exploited by adversarial examples. These holes are effectively dense in the input space, making it easy to find a perturbed image that can be misclassified. By studying the topology of network activation, we find global patterns in the form of activation subgraphs which can both reliably determine whether an example is adversarial and can recover the true category of the example well above chance, implying that semantic information about the input is embedded globally via the activation pattern in deep networks.

Share this with someone who'll enjoy it:

   Access Paper Source

Share this with someone who'll enjoy it: