Qun Liu

NEZHA: Neural Contextualized Representation for Chinese Language Understanding

Aug 31, 2019
Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen, Qun Liu

Pre-trained language models have achieved great success in various natural language understanding (NLU) tasks due to their capacity to capture deep contextualized information in text by pre-training on large-scale corpora. In this technical report, we present our practice of pre-training language models named NEZHA (NEural contextualiZed representation for CHinese lAnguage understanding) on Chinese corpora and fine-tuning them for Chinese NLU tasks. The current version of NEZHA is based on BERT with a collection of proven improvements: Functional Relative Positional Encoding as an effective positional encoding scheme, the Whole Word Masking strategy, Mixed Precision Training, and the LAMB optimizer for training the models. The experimental results show that NEZHA achieves state-of-the-art performance when fine-tuned on several representative Chinese tasks, including named entity recognition (People's Daily NER), sentence matching (LCQMC), Chinese sentiment classification (ChnSenti) and natural language inference (XNLI).
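Of the improvements named in the abstract above, Whole Word Masking is the easiest to illustrate. The following is a minimal sketch, not the NEZHA implementation: it assumes the text has already been segmented into words by some external tool, and it simplifies BERT-style masking by always substituting a mask token. The function name `whole_word_mask` and its parameters are hypothetical.

```python
import random

def whole_word_mask(words, mask_prob=0.15, mask_token="[MASK]", rng=None):
    """Mask at the word level: if a word is selected, every one of its
    characters is replaced by the mask token together.

    `words` is a list of already-segmented words (an assumption; the
    segmenter is outside this sketch). Returns (tokens, labels), where
    labels hold the original character at masked positions and None elsewhere.
    """
    rng = rng or random.Random()
    tokens, labels = [], []
    for word in words:
        # One decision per word, shared by all of its characters.
        masked = rng.random() < mask_prob
        for ch in word:
            tokens.append(mask_token if masked else ch)
            labels.append(ch if masked else None)
    return tokens, labels

# Illustrative usage with a hand-segmented sentence.
example_tokens, example_labels = whole_word_mask(
    ["自然", "语言", "理解"], mask_prob=0.5, rng=random.Random(0)
)
```

The point of the design is that a multi-character word is never partially masked, so the model cannot recover a masked character simply from the other characters of the same word.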

Dialog State Tracking with Reinforced Data Augmentation

Aug 21, 2019
Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu

Neural dialog state trackers are generally limited by the lack of quantity and diversity in annotated training data. In this paper, we address this difficulty by proposing a reinforcement learning (RL) based framework for data augmentation that can generate high-quality data to improve the neural state tracker. Specifically, we introduce a novel contextual bandit generator to learn fine-grained augmentation policies that can generate new effective instances by choosing suitable replacements for the specific context. Moreover, by alternating between training the generator and the state tracker, we can keep refining the generative policies to generate more high-quality training data for the neural state tracker. Experimental results on the WoZ and MultiWoZ (restaurant) datasets demonstrate that the proposed framework significantly improves performance over state-of-the-art models, especially with limited training data.

* Under review
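The contextual bandit idea in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's generator: it uses a tabular softmax policy with a REINFORCE-style update in place of a neural model, and the class `BanditReplacer`, its candidate table, and the reward passed to `update` are all hypothetical. In the paper's framing the reward would come from the state tracker.

```python
import math
import random
from collections import defaultdict

class BanditReplacer:
    """Toy contextual bandit: for a (context, phrase) pair, pick a
    replacement from a fixed candidate list and learn from a scalar reward."""

    def __init__(self, candidates, lr=0.5, seed=0):
        self.candidates = candidates      # phrase -> list of replacements
        self.lr = lr
        self.prefs = defaultdict(float)   # (context, phrase, option) -> preference
        self.rng = random.Random(seed)

    def probs(self, context, phrase):
        """Softmax over the preferences of the candidate replacements."""
        options = self.candidates[phrase]
        scores = [self.prefs[(context, phrase, o)] for o in options]
        top = max(scores)                 # subtract max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return options, [e / total for e in exps]

    def choose(self, context, phrase):
        """Sample one replacement according to the current policy."""
        options, p = self.probs(context, phrase)
        return self.rng.choices(options, weights=p, k=1)[0]

    def update(self, context, phrase, chosen, reward):
        """Policy-gradient step: raise the chosen option's preference when
        the reward is positive, lower it when the reward is negative."""
        options, p = self.probs(context, phrase)
        for option, prob in zip(options, p):
            grad = (1.0 - prob) if option == chosen else -prob
            self.prefs[(context, phrase, option)] += self.lr * reward * grad

# Illustrative usage: the context string and reward value are made up.
bandit = BanditReplacer({"cheap": ["inexpensive", "low-cost"]})
pick = bandit.choose("price", "cheap")
bandit.update("price", "cheap", pick, reward=1.0)
```

Because preferences are keyed by context, the same phrase can learn different preferred replacements in different contexts, which is the "fine-grained" aspect the abstract refers to.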

PCGAN-CHAR: Progressively Trained Classifier Generative Adversarial Networks for Classification of Noisy Handwritten Bangla Characters

Aug 11, 2019
Qun Liu, Edward Collier, Supratik Mukhopadhyay

Due to the sparsity of features, noise has proven to be a great inhibitor in the classification of handwritten characters. To combat this, most techniques denoise the data before classification. In this paper, we consolidate the approach by training an all-in-one model that is able to classify even noisy characters. For classification, we progressively train a classifier generative adversarial network on the characters from low to high resolution. We show that by learning the features at each resolution independently, a trained model is able to accurately classify characters even in the presence of noise. We experimentally demonstrate the effectiveness of our approach by classifying noisy versions of the MNIST, handwritten Bangla Numeral, and Basic Character datasets.

* Paper was accepted at the 21st International Conference on Asia-Pacific Digital Libraries (ICADL 2019)

GPT-based Generation for Classical Chinese Poetry

Jul 12, 2019
Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang

We present a simple yet effective method for generating high-quality classical Chinese poetry with a Generative Pre-trained Language Model (GPT). The method adopts a simple GPT model, without using any human-crafted rules or features, or designing any additional neural components. The proposed model learns to generate various forms of classical Chinese poems, including Jueju, Lüshi, various Cipai and Couplets, and the generated poems are of very high quality. We also propose and implement a method to fine-tune the model to generate acrostic poetry. To the best of our knowledge, this is the first work to employ GPT in developing a poetry generation system. We will release an online demonstration system in the near future to show the generation capability of the proposed method for classical Chinese poetry.

Modeling Semantic Compositionality with Sememe Knowledge

Jul 10, 2019
Fanchao Qi, Junjie Huang, Chenghao Yang, Zhiyuan Liu, Xiao Chen, Qun Liu, Maosong Sun

Semantic compositionality (SC) refers to the phenomenon that the meaning of a complex linguistic unit can be composed of the meanings of its constituents. Most related works focus on using complicated compositionality functions to model SC, while few consider external knowledge in their models. In this paper, we verify the effectiveness of sememes, the minimum semantic units of human languages, in modeling SC by a confirmatory experiment. Furthermore, we make the first attempt to incorporate sememe knowledge into SC models, and employ the sememe-incorporated models in learning representations of multiword expressions, a typical task of SC. In experiments, we implement our models by incorporating knowledge from the well-known sememe knowledge base HowNet and perform both intrinsic and extrinsic evaluations. Experimental results show that our models achieve a significant performance boost compared to baseline methods that do not consider sememe knowledge. We further conduct quantitative analysis and case studies to demonstrate the effectiveness of applying sememe knowledge in modeling SC. All the code and data of this paper can be obtained from https://github.com/thunlp/Sememe-SC.

* To appear at ACL 2019

Decomposable Neural Paraphrase Generation

Jun 24, 2019
Zichao Li, Xin Jiang, Lifeng Shang, Qun Liu

Paraphrasing exists at different granularity levels, such as the lexical, phrasal and sentential levels. This paper presents the Decomposable Neural Paraphrase Generator (DNPG), a Transformer-based model that can learn and generate paraphrases of a sentence at different levels of granularity in a disentangled way. Specifically, the model is composed of multiple encoders and decoders with different structures, each of which corresponds to a specific granularity. The empirical study shows that the decomposition mechanism of DNPG makes paraphrase generation more interpretable and controllable. Based on DNPG, we further develop an unsupervised domain adaptation method for paraphrase generation. Experimental results show that the proposed model achieves competitive in-domain performance compared to state-of-the-art neural models, and significantly better performance when adapting to a new domain.

* To appear in ACL 2019

Bridging the Gap between Training and Inference for Neural Machine Translation

Jun 17, 2019
Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu

Neural Machine Translation (NMT) generates target words sequentially by predicting the next word conditioned on the context words. At training time, it predicts with the ground truth words as context, while at inference it has to generate the entire sequence from scratch. This discrepancy in the fed context leads to error accumulation along the way. Furthermore, word-level training requires strict matching between the generated sequence and the ground truth sequence, which leads to overcorrection of different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the sequence predicted by the model during training, where the predicted sequence is selected with a sentence-level optimum. Experimental results on Chinese->English and WMT'14 English->German translation tasks demonstrate that our approach can achieve significant improvements on multiple datasets.

* 10 pages, 7 figures
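The context-sampling step described in the abstract above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the predicted sequence is taken as given and already has the same length as the ground truth, and the decay schedule is the inverse-sigmoid form known from scheduled sampling, used here as an assumed choice. The names `gold_probability`, `mix_context` and the default `mu` are hypothetical.

```python
import math
import random

def gold_probability(epoch, mu=12.0):
    """Inverse-sigmoid decay for the chance of feeding a ground-truth word.
    Equals mu / (mu + 1) at epoch 0 and decreases towards 0 afterwards."""
    return mu / (mu + math.exp(epoch / mu))

def mix_context(gold, predicted, p_gold, rng=None):
    """Build the decoder context position by position: take the ground-truth
    word with probability p_gold, otherwise the model-predicted word."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted sequences must have equal length")
    rng = rng or random.Random()
    return [g if rng.random() < p_gold else p for g, p in zip(gold, predicted)]

# Illustrative usage with made-up token sequences.
gold = ["we", "should", "comply", "with", "the", "rule"]
predicted = ["we", "should", "abide", "by", "the", "law"]
context = mix_context(gold, predicted, gold_probability(epoch=5), rng=random.Random(0))
```

Early in training the context is mostly ground truth, and the share of predicted words grows as `gold_probability` decays, which gradually exposes the model to the kind of context it will see at inference.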

ERNIE: Enhanced Language Representation with Informative Entities

Jun 04, 2019
Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, Qun Liu

Neural language representation models such as BERT, pre-trained on large-scale corpora, can capture rich semantic patterns from plain text and be fine-tuned to consistently improve the performance of various NLP tasks. However, existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results demonstrate that ERNIE achieves significant improvements on various knowledge-driven tasks, while remaining comparable with the state-of-the-art model BERT on other common NLP tasks. The source code of this paper can be obtained from https://github.com/thunlp/ERNIE.

* Accepted by ACL 2019
