Title: Focusing More on Conflicts with Mis-Predictions Helps Language Pre-Training

Dec 16, 2020

Chen Xing, Wencong Xiao, Yong Li, Wei Lin

Abstract: In this work, we propose to improve the effectiveness of language pre-training methods with the help of mis-predictions during pre-training. Neglecting words in the input sentence that have conflicting semantics with mis-predictions is likely to be the reason for generating mis-predictions at pre-training. Therefore, we hypothesize that mis-predictions during pre-training can act as detectors of where the model's focus is misplaced. If we train the model to focus more on the words that conflict with the mis-predictions while focusing less on the remaining words in the input sentence, the mis-predictions can be corrected more easily and the entire model can be better trained. To this end, we introduce Focusing Less on Context of Mis-predictions (McMisP). In McMisP, we record the co-occurrence information between words to detect, in an unsupervised way, the words that conflict with mis-predictions. McMisP then uses this information to guide the attention modules when a mis-prediction occurs. Specifically, several attention modules in the Transformer are optimized to focus more on words in the input sentence that have rarely co-occurred with the mis-predictions, and less on words that have co-occurred with them often. Results show that McMisP significantly expedites the pre-training of BERT and ELECTRA and improves their performance on downstream tasks.
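The co-occurrence idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names (`build_cooccurrence`, `cooccurrence`, `attention_target`), the sentence-level co-occurrence window, and the softmax over negated log-counts are all assumptions chosen for illustration. The abstract does not say how the resulting target is applied to the attention modules, so that step is left out.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence(sentences):
    """Count how often each unordered word pair appears in the same sentence.

    Assumption: co-occurrence is measured at sentence level.
    """
    counts = Counter()
    for sentence in sentences:
        for a, b in combinations(sorted(set(sentence)), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence(counts, a, b):
    """Look up the symmetric co-occurrence count of two words (0 if unseen)."""
    return counts[tuple(sorted((a, b)))]


def attention_target(counts, context, mispredicted):
    """Hypothetical attention target over the context words.

    Words that rarely co-occur with the mis-prediction receive more weight.
    The softmax over negated log-counts is an illustrative choice only.
    """
    scores = [-math.log1p(cooccurrence(counts, w, mispredicted)) for w in context]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy corpus (assumed data, not from the paper).
corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "dog", "barks"],
    ["the", "cat", "meows"],
]
counts = build_cooccurrence(corpus)

# Input "the [MASK] meows": the true word is "cat", the model predicts "dog".
context = ["the", "meows"]
weights = attention_target(counts, context, "dog")
for word, weight in zip(context, weights):
    print(f"{word}: {weight:.2f}")
```

In this toy setup, "meows" never co-occurs with the mis-prediction "dog", so it receives the larger share of the target weight, while the frequently co-occurring "the" receives less.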
