Title: InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

Jul 15, 2020

Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou

Abstract: In this work, we formulate cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts. The unified view helps us to better understand the existing methods for learning cross-lingual representations. More importantly, the information-theoretic framework inspires us to propose a pre-training task based on contrastive learning. Given a bilingual sentence pair, we regard them as two views of the same meaning, and encourage their encoded representations to be more similar than the negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at http://aka.ms/infoxlm.

* 11 pages
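
The contrastive idea in the abstract (a translation pair should have more similar encodings than the negative examples) can be illustrated with a generic InfoNCE-style loss, a common contrastive objective that lower-bounds mutual information. This is only a minimal sketch and not the authors' implementation: the function name `info_nce_loss`, the use of in-batch negatives, cosine similarity, and the `temperature` value are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def info_nce_loss(src, tgt, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of sentence embeddings.

    Assumption for this sketch: src[i] and tgt[i] encode a translation pair,
    and every tgt[j] with j != i acts as a negative example for src[i].
    """
    # L2-normalise so the dot product below is cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)

    # Pairwise similarities, shape (batch, batch), scaled by the temperature.
    logits = src @ tgt.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
en = rng.normal(size=(4, 8))
aligned = info_nce_loss(en, en + 0.01 * rng.normal(size=(4, 8)))
shuffled = info_nce_loss(en, en[::-1].copy())  # pairs deliberately mismatched
```

In this toy setup the loss is lower when each row of `tgt` is a near copy of the matching row of `src` than when the pairs are mismatched, which is the behaviour a contrastive pre-training task is meant to reward.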
