Title: CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Oct 16, 2020

Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen

Abstract: Data augmentation has been demonstrated to be an effective strategy for improving model generalization and data efficiency. However, due to the discrete nature of natural language, designing label-preserving transformations for text data tends to be more challenging. In this paper, we propose a novel data augmentation framework dubbed CoDA, which synthesizes diverse and informative augmented examples by integrating multiple transformations organically. Moreover, a contrastive regularization objective is introduced to capture the global relationship among all the data samples. A momentum encoder along with a memory bank is further leveraged to better estimate the contrastive loss. To verify the effectiveness of the proposed framework, we apply CoDA to Transformer-based models on a wide range of natural language understanding tasks. On the GLUE benchmark, CoDA gives rise to an average improvement of 2.2% when applied to the RoBERTa-large model. More importantly, it consistently exhibits stronger results relative to several competitive data augmentation and adversarial training baselines (including in low-resource settings). Extensive experiments show that the proposed contrastive objective can be flexibly combined with various data augmentation approaches to further boost their performance, highlighting the wide applicability of the CoDA framework.
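
The abstract mentions a contrastive objective estimated with a momentum encoder and a memory bank but gives no formulas. As a rough illustration only, the sketch below shows one common way such a setup can be written: an InfoNCE-style loss over one positive key and a queue of stored negatives, plus an exponential-moving-average update for the key encoder. The loss form, the temperature, the momentum value, the queue size, and every function name here are assumptions made for illustration; none of them is taken from the paper.

```python
import math
from collections import deque


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def momentum_update(key_params, query_params, m=0.99):
    """Move key-encoder parameters slowly toward the query-encoder parameters.

    Hypothetical helper: parameters are flat lists of floats, m is assumed.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def info_nce(query, positive_key, memory_bank, temperature=0.1):
    """Negative log-softmax score of the positive key against bank negatives.

    Assumed loss form (InfoNCE with cosine similarity); not from the source.
    """
    q = normalize(query)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(neg)) / temperature for neg in memory_bank]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


# Memory bank modeled as a fixed-size FIFO queue of past key embeddings.
bank = deque(maxlen=4)
bank.extend([[0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])

# A positive key close to the query should give a lower loss than a distant one.
loss_aligned = info_nce([1.0, 0.0], [0.9, 0.1], bank)
loss_misaligned = info_nce([1.0, 0.0], [-0.9, 0.1], bank)
```

In this toy setup, a larger memory bank supplies more negatives per query without enlarging the batch, and the slow momentum update keeps the stored keys roughly consistent with the current key encoder. Both points are general properties of this style of contrastive training rather than claims about CoDA's specific design.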

View paper on OpenReview
