Title: InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Oct 14, 2020

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu


Abstract: Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective, and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) a Robust Feature regularizer, which increases the mutual information between local robust features and global features. We provide a principled way to theoretically analyze and improve the robustness of representation learning for language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy on several adversarial datasets for Natural Language Inference (NLI) and Question Answering (QA) tasks.

* 20 pages, 8 tables, 2 figures
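
The two regularizers described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not say how the mutual information terms are estimated, so the InfoNCE estimator, the function names, and the weights `beta` and `alpha` below are all assumptions made for illustration. The sketch only shows the sign convention the abstract implies: the input-to-feature term is penalized, and the term between local robust features and global features is rewarded.

```python
import numpy as np


def info_nce_lower_bound(local_feats, global_feats, temperature=0.1):
    """InfoNCE lower bound (in nats) on the mutual information between
    paired rows of two feature matrices. Assumed estimator, not taken
    from the paper's abstract."""
    # Normalize rows so the scores are scaled cosine similarities.
    a = local_feats / np.linalg.norm(local_feats, axis=1, keepdims=True)
    b = global_feats / np.linalg.norm(global_feats, axis=1, keepdims=True)
    scores = a @ b.T / temperature  # shape (n, n); diagonal holds the true pairs
    # Numerically stable log-softmax over each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    n = scores.shape[0]
    # The bound is log(n) plus the mean log-probability of the true pairs,
    # so it can never exceed log(n).
    return float(np.mean(np.diag(log_probs)) + np.log(n))


def regularized_loss(task_loss, mi_input_feature, mi_robust_global,
                     beta=0.1, alpha=0.5):
    """Hypothetical combined objective: penalize input-feature MI
    (Information Bottleneck term) and reward MI between local robust
    features and global features (Robust Feature term)."""
    return task_loss + beta * mi_input_feature - alpha * mi_robust_global
```

In this sketch `mi_input_feature` is passed in as a plain number because a lower bound such as InfoNCE is not a sound quantity to minimize; a real implementation would need an appropriate estimator for that term.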

View paper on OpenReview
