Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Piji Li

Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

Jun 20, 2021
Piji Li, Shuming Shi

Figure 1 for Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

Figure 2 for Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

Figure 3 for Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

Figure 4 for Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

We investigate the problem of Chinese Grammatical Error Correction (CGEC) and present a new framework named Tail-to-Tail (\textbf{TtT}) non-autoregressive sequence prediction to address the deep issues hidden in CGEC. Considering that most tokens are correct and can be conveyed directly from source to target, and the error positions can be estimated and corrected based on the bidirectional context information, thus we employ a BERT-initialized Transformer Encoder as the backbone model to conduct information modeling and conveying. Considering that only relying on the same position substitution cannot handle the variable-length correction cases, various operations such substitution, deletion, insertion, and local paraphrasing are required jointly. Therefore, a Conditional Random Fields (CRF) layer is stacked on the up tail to conduct non-autoregressive sequence prediction by modeling the token dependencies. Since most tokens are correct and easily to be predicted/conveyed to the target, then the models may suffer from a severe class imbalance issue. To alleviate this problem, focal loss penalty strategies are integrated into the loss functions. Moreover, besides the typical fix-length error correction datasets, we also construct a variable-length corpus to conduct experiments. Experimental results on standard datasets, especially on the variable-length datasets, demonstrate the effectiveness of TtT in terms of sentence-level Accuracy, Precision, Recall, and F1-Measure on tasks of error Detection and Correction.

* Accepted in the main conference of ACL 2021. Code: https://github.com/lipiji/TtT

Via

Access Paper or Ask Questions

Non-Autoregressive Text Generation with Pre-trained Language Models

Feb 16, 2021
Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, Nigel Collier

Figure 1 for Non-Autoregressive Text Generation with Pre-trained Language Models

Figure 2 for Non-Autoregressive Text Generation with Pre-trained Language Models

Figure 3 for Non-Autoregressive Text Generation with Pre-trained Language Models

Figure 4 for Non-Autoregressive Text Generation with Pre-trained Language Models

Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference speed. However, the generation quality of existing NAG models still lags behind their autoregressive counterparts. In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance. Additionally, we devise mechanisms to alleviate the two common problems of vanilla NAG models: the inflexibility of prefixed output length and the conditional independence of individual token predictions. Lastly, to further increase the speed advantage of the proposed model, we propose a new decoding strategy, ratio-first, for applications where the output lengths can be approximately estimated beforehand. For a comprehensive evaluation, we test the proposed model on three text generation tasks, including text summarization, sentence compression and machine translation. Experimental results show that our model significantly outperforms existing non-autoregressive baselines and achieves competitive performance with many strong autoregressive models. In addition, we also conduct extensive analysis experiments to reveal the effect of each proposed component.

* Accepted to EACL 2021

Via

Access Paper or Ask Questions

Generating Diversified Comments via Reader-Aware Topic Modeling and Saliency Detection

Feb 13, 2021
Wei Wang, Piji Li, Hai-Tao Zheng

Figure 1 for Generating Diversified Comments via Reader-Aware Topic Modeling and Saliency Detection

Figure 2 for Generating Diversified Comments via Reader-Aware Topic Modeling and Saliency Detection

Figure 3 for Generating Diversified Comments via Reader-Aware Topic Modeling and Saliency Detection

Figure 4 for Generating Diversified Comments via Reader-Aware Topic Modeling and Saliency Detection

Automatic comment generation is a special and challenging task to verify the model ability on news content comprehension and language generation. Comments not only convey salient and interesting information in news articles, but also imply various and different reader characteristics which we treat as the essential clues for diversity. However, most of the comment generation approaches only focus on saliency information extraction, while the reader-aware factors implied by comments are neglected. To address this issue, we propose a unified reader-aware topic modeling and saliency information detection framework to enhance the quality of generated comments. For reader-aware topic modeling, we design a variational generative clustering algorithm for latent semantic learning and topic mining from reader comments. For saliency information detection, we introduce Bernoulli distribution estimating on news content to select saliency information. The obtained topic representations as well as the selected saliency information are incorporated into the decoder to generate diversified and informative comments. Experimental results on three datasets show that our framework outperforms existing baseline methods in terms of both automatic metrics and human evaluation. The potential ethical issues are also discussed in detail.

* AAAI 2021. The potential ethical issues are also discussed in detail

Via

Access Paper or Ask Questions

Abstractive Opinion Tagging

Jan 24, 2021
Qintong Li, Piji Li, Xinyi Li, Zhaochun Ren, Zhumin Chen, Maarten de Rijke

Figure 1 for Abstractive Opinion Tagging

Figure 2 for Abstractive Opinion Tagging

Figure 3 for Abstractive Opinion Tagging

Figure 4 for Abstractive Opinion Tagging

In e-commerce, opinion tags refer to a ranked list of tags provided by the e-commerce platform that reflect characteristics of reviews of an item. To assist consumers to quickly grasp a large number of reviews about an item, opinion tags are increasingly being applied by e-commerce platforms. Current mechanisms for generating opinion tags rely on either manual labelling or heuristic methods, which is time-consuming and ineffective. In this paper, we propose the abstractive opinion tagging task, where systems have to automatically generate a ranked list of opinion tags that are based on, but need not occur in, a given set of user-generated reviews. The abstractive opinion tagging task comes with three main challenges: (1) the noisy nature of reviews; (2) the formal nature of opinion tags vs. the colloquial language usage in reviews; and (3) the need to distinguish between different items with very similar aspects. To address these challenges, we propose an abstractive opinion tagging framework, named AOT-Net, to generate a ranked list of opinion tags given a large number of reviews. First, a sentence-level salience estimation component estimates each review's salience score. Next, a review clustering and ranking component ranks reviews in two steps: first, reviews are grouped into clusters and ranked by cluster size; then, reviews within each cluster are ranked by their distance to the cluster center. Finally, given the ranked reviews, a rank-aware opinion tagging component incorporates an alignment feature and alignment loss to generate a ranked list of opinion tags. To facilitate the study of this task, we create and release a large-scale dataset, called eComTag, crawled from real-world e-commerce websites. Extensive experiments conducted on the eComTag dataset verify the effectiveness of the proposed AOT-Net in terms of various evaluation metrics.

* Accepted by WSDM 2021

Via

Access Paper or Ask Questions

Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation

Dec 24, 2020
Zelong Yang, Yan Wang, Piji Li, Shaobin Lin, Shuming Shi, Shao-Lun Huang

Figure 1 for Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation

Figure 2 for Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation

Figure 3 for Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation

Figure 4 for Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation

The multiplayer online battle arena (MOBA) games have become increasingly popular in recent years. Consequently, many efforts have been devoted to providing pre-game or in-game predictions for them. However, these works are limited in the following two aspects: 1) the lack of sufficient in-game features; 2) the absence of interpretability in the prediction results. These two limitations greatly restrict the practical performance and industrial application of the current works. In this work, we collect and release a large-scale dataset containing rich in-game features for the popular MOBA game Honor of Kings. We then propose to predict four types of important events in an interpretable way by attributing the predictions to the input features using two gradient-based attribution methods: Integrated Gradients and SmoothGrad. To evaluate the explanatory power of different models and attribution methods, a fidelity-based evaluation metric is further proposed. Finally, we evaluate the accuracy and Fidelity of several competitive methods on the collected dataset to assess how well machines predict events in MOBA games.

Via

Access Paper or Ask Questions

Predicting Events In MOBA Games: Dataset, Attribution, and Evaluation

Dec 17, 2020
Zelong Yang, Yan Wang, Piji Li, Shaobin Lin, Shuming Shi, Shao-Lun Huang

The multiplayer online battle arena (MOBA) games are becoming increasingly popular in recent years. Consequently, many efforts have been devoted to providing pre-game or in-game predictions for MOBA games. However, these works are limited in the following two aspects: 1) the lack of sufficient in-game features; 2) the absence of interpretability in the prediction results. These two limitations greatly restrict their practical performances and industrial applications. In this work, we collect and release a large-scale dataset containing rich in-game features for the popular MOBA game Honor of Kings. We then propose to predict four types of important events in an interpretable way by attributing the predictions to the input features using two gradient-based attribution methods: Integrated Gradients and SmoothGrad. To evaluate the explanatory power of different models and attribution methods, a fidelity-based evaluation metric is further proposed. Finally, we evaluate the accuracy and Fidelity of several competitive methods on the collected dataset to assess how well do machines predict the events in MOBA games.

Via

Access Paper or Ask Questions

Consistency and Coherency Enhanced Story Generation

Oct 17, 2020
Wei Wang, Piji Li, Hai-Tao Zheng

Figure 1 for Consistency and Coherency Enhanced Story Generation

Figure 2 for Consistency and Coherency Enhanced Story Generation

Figure 3 for Consistency and Coherency Enhanced Story Generation

Figure 4 for Consistency and Coherency Enhanced Story Generation

Story generation is a challenging task, which demands to maintain consistency of the plots and characters throughout the story. Previous works have shown that GPT2, a large-scale language model, has achieved good performance on story generation. However, we observe that several serious issues still exist in the stories generated by GPT2 which can be categorized into two folds: consistency and coherency. In terms of consistency, on one hand, GPT2 cannot guarantee the consistency of the plots explicitly. On the other hand, the generated stories usually contain coreference errors. In terms of coherency, GPT2 does not take account of the discourse relations between sentences of stories directly. To enhance the consistency and coherency of the generated stories, we propose a two-stage generation framework, where the first stage is to organize the story outline which depicts the story plots and events, and the second stage is to expand the outline into a complete story. Therefore the plots consistency can be controlled and guaranteed explicitly. In addition, coreference supervision signals are incorporated to reduce coreference errors and improve the coreference consistency. Moreover, we design an auxiliary task of discourse relation modeling to improve the coherency of the generated stories. Experimental results on a story dataset show that our model outperforms the baseline approaches in terms of both automatic metrics and human evaluation.

Via

Access Paper or Ask Questions

Dialogue Generation on Infrequent Sentence Functions via Structured Meta-Learning

Oct 04, 2020
Yifan Gao, Piji Li, Wei Bi, Xiaojiang Liu, Michael R. Lyu, Irwin King

Figure 1 for Dialogue Generation on Infrequent Sentence Functions via Structured Meta-Learning

Figure 2 for Dialogue Generation on Infrequent Sentence Functions via Structured Meta-Learning

Figure 3 for Dialogue Generation on Infrequent Sentence Functions via Structured Meta-Learning

Figure 4 for Dialogue Generation on Infrequent Sentence Functions via Structured Meta-Learning

Sentence function is an important linguistic feature indicating the communicative purpose in uttering a sentence. Incorporating sentence functions into conversations has shown improvements in the quality of generated responses. However, the number of utterances for different types of fine-grained sentence functions is extremely imbalanced. Besides a small number of high-resource sentence functions, a large portion of sentence functions is infrequent. Consequently, dialogue generation conditioned on these infrequent sentence functions suffers from data deficiency. In this paper, we investigate a structured meta-learning (SML) approach for dialogue generation on infrequent sentence functions. We treat dialogue generation conditioned on different sentence functions as separate tasks, and apply model-agnostic meta-learning to high-resource sentence functions data. Furthermore, SML enhances meta-learning effectiveness by promoting knowledge customization among different sentence functions but simultaneously preserving knowledge generalization for similar sentence functions. Experimental results demonstrate that SML not only improves the informativeness and relevance of generated responses, but also can generate responses consistent with the target sentence functions.

* EMNLP 2020, Findings, 10 pages, 4 figures

Via

Access Paper or Ask Questions