Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

Adversarial Examples with Difficult Common Words for Paraphrase Identification

Sep 06, 2019
Zhouxing Shi, Minlie Huang, Ting Yao, Jingfang Xu

Despite the success of deep models for paraphrase identification on benchmark datasets, these models are still vulnerable to adversarial examples. In this paper, we propose a novel algorithm to generate a new type of adversarial examples to study the robustness of deep paraphrase identification models. We first sample an original sentence pair from the corpus and then adversarially replace some word pairs with difficult common words. We take multiple steps and use beam search to find a modification solution that makes the target model fail, and thereby obtain an adversarial example. The word replacement is also constrained by heuristic rules and a language model, to preserve the label and grammaticality of the example during modification. Experiments show that our algorithm can generate adversarial examples on which the performance of the target model drops dramatically. Meanwhile, human annotators are much less affected, and the generated sentences retain a good grammaticality. We also show that adversarial training with generated adversarial examples can improve model robustness.

Share this with someone who'll enjoy it:

   Access Paper Source

Share this with someone who'll enjoy it: