Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

Defending Adversarial Attacks without Adversarial Attacks in Deep Reinforcement Learning

Aug 14, 2020
Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Zhu Sun

Share this with someone who'll enjoy it:

Many recent studies in deep reinforcement learning (DRL) have proposed to boost adversarial robustness through policy distillation utilizing adversarial training, where additional adversarial examples are added in the training process of the student policy; this makes the robustness improvement less flexible and more computationally expensive. In contrast, we propose an efficient policy distillation paradigm called robust policy distillation that is capable of achieving an adversarially robust student policy without relying on any adversarial example during student policy training. To this end, we devise a new policy distillation loss that consists of two terms: 1) a prescription gap maximization loss aiming at simultaneously maximizing the likelihood of the action selected by the teacher policy and the entropy over the remaining actions; 2) a Jacobian regularization loss that minimizes the magnitude of Jacobian with respect to the input state. The theoretical analysis proves that our distillation loss guarantees to increase the prescription gap and the adversarial robustness. Meanwhile, experiments on five Atari games firmly verifies the superiority of our policy distillation on boosting adversarial robustness compared to other state-of-the-arts.

   Access Paper Source

Share this with someone who'll enjoy it: