Wai Lam

Narrative Incoherence Detection

Dec 21, 2020

Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan

Abstract: Motivated by the increasing popularity of intelligent editing assistants, we introduce and investigate the task of narrative incoherence detection: given a (corrupted) long-form narrative, decide whether there exists some semantic discrepancy in the narrative flow. Specifically, we focus on missing sentence and incoherent sentence detection. Despite its simple setup, this task is challenging, as the model needs to understand and analyze a multi-sentence narrative text and make decisions at the sentence level. As an initial step towards this task, we implement several baselines that either directly analyze the raw text (token-level) or analyze learned sentence representations (sentence-level). We observe that while token-level modeling enjoys greater expressive power and hence better performance, sentence-level modeling possesses an advantage in efficiency and flexibility. With pre-training on large-scale data and cycle-consistent sentence embedding, our extended sentence-level model can achieve detection accuracy comparable to the token-level model. As a by-product, such a strategy enables simultaneous incoherence detection and infilling/modification suggestions.
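To make the sentence-level setup concrete, here is a minimal toy sketch of scoring the transitions between adjacent sentences and flagging the weakest one as a candidate gap. It is not the paper's model: the bag-of-words vectors stand in for learned sentence representations, and the function names `bow`, `cosine`, and `weakest_link` are hypothetical.

```python
import math
from collections import Counter


def bow(sentence):
    """Bag-of-words counts: a crude stand-in for a learned sentence embedding."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weakest_link(sentences):
    """Return (i, scores), where transition i -> i+1 has the lowest similarity."""
    vectors = [bow(s) for s in sentences]
    scores = [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    weakest = min(range(len(scores)), key=scores.__getitem__)
    return weakest, scores


if __name__ == "__main__":
    story = [
        "the cat sat on the mat",
        "the cat slept on the mat",
        "stock prices fell sharply today",
        "stock prices fell again today",
    ]
    index, scores = weakest_link(story)
    print(index, scores)
```

In this toy narrative the transition between the second and third sentences shares no words, so it is the one flagged. A learned model would replace the word-overlap score with a trained coherence classifier.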

Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond

Oct 23, 2020

Xin Li, Lidong Bing, Wenxuan Zhang, Zheng Li, Wai Lam

Abstract: Cross-lingual adaptation with multilingual pre-trained language models (mPTLMs) mainly consists of two lines of work: the zero-shot approach and the translation-based approach, both of which have been studied extensively on sequence-level tasks. We further verify the efficacy of these cross-lingual adaptation approaches by evaluating their performance on more fine-grained sequence tagging tasks. After re-examining their strengths and drawbacks, we propose a novel framework to consolidate the zero-shot approach and the translation-based approach for better adaptation performance. Instead of simply augmenting the source data with the machine-translated data, we tailor-make a warm-up mechanism to quickly update the mPTLMs with the gradients estimated on a small amount of translated data. Then, the adaptation approach is applied to the refined parameters and the cross-lingual transfer is performed in a warm-start way. The experimental results on nine target languages demonstrate that our method is beneficial to the cross-lingual adaptation of various sequence tagging tasks.

Multi-hop Inference for Question-driven Summarization

Oct 08, 2020

Yang Deng, Wenxuan Zhang, Wai Lam

Abstract: Question-driven summarization has been recently studied as an effective approach to summarizing the source document to produce concise but informative answers for non-factoid questions. In this work, we propose a novel question-driven abstractive summarization method, Multi-hop Selective Generator (MSG), to incorporate multi-hop reasoning into question-driven summarization and, meanwhile, provide justifications for the generated summaries. Specifically, we jointly model the relevance to the question and the interrelation among different sentences via a human-like multi-hop inference module, which captures important sentences for justifying the summarized answer. A gated selective pointer generator network with a multi-view coverage mechanism is designed to integrate diverse information from different perspectives. Experimental results show that the proposed method consistently outperforms state-of-the-art methods on two non-factoid QA datasets, namely WikiHow and PubMedQA.

* Accepted by EMNLP 2020 (main conference, long paper)

Partially-Aligned Data-to-Text Generation with Distant Supervision

Oct 03, 2020

Zihao Fu, Bei Shi, Wai Lam, Lidong Bing, Zhiyuan Liu

Abstract: The Data-to-Text task aims to generate human-readable text describing given structured data, enabling greater interpretability. However, the typical generation task is confined to a few particular domains since it requires well-aligned data, which is difficult and expensive to obtain. Using partially-aligned data is an alternative way of solving the dataset scarcity problem. This kind of data is much easier to obtain since it can be produced automatically. However, using this kind of data induces the over-generation problem, which poses difficulties for existing models: they tend to add unrelated excerpts during the generation procedure. In order to effectively utilize automatically annotated partially-aligned datasets, we extend the traditional generation task to a refined task called Partially-Aligned Data-to-Text Generation (PADTG), which is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains. To tackle this new task, we propose a novel distant supervision generation framework. It first estimates the input data's supportiveness for each target word with an estimator and then applies a supportiveness adaptor and a rebalanced beam search to harness the over-generation problem in the training and generation phases respectively. We also contribute a partially-aligned dataset (the data and source code of this paper can be obtained from https://github.com/fuzihaofzh/distant_supervision_nlg) by sampling sentences from Wikipedia and automatically extracting corresponding KB triples for each sentence from Wikidata. The experimental results show that our framework outperforms all baseline models and verify the feasibility of utilizing partially-aligned data.

* To appear in EMNLP 2020. 11 pages

Enhancing Dialogue Generation via Multi-Level Contrastive Learning

Sep 19, 2020

Xin Li, Piji Li, Yan Wang, Xiaojiang Liu, Wai Lam

Abstract: Most of the existing works for dialogue generation are data-driven models trained directly on corpora crawled from websites. They mainly focus on improving the model architecture to produce better responses but pay little attention to considering the quality of the training data contrastively. In this paper, we propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query. A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives. Because these objectives are calculated at the sentence level, they may erroneously encourage the generation of uninformative words or suppress the generation of informative ones. To tackle this incidental issue, on the one hand, we design an exquisite token-level strategy for estimating the instance loss more accurately. On the other hand, we build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words. We evaluate the proposed model on a carefully annotated dialogue dataset, and the results suggest that our model can generate more relevant and diverse responses compared to the baseline models.

Opinion-aware Answer Generation for Review-driven Question Answering in E-Commerce

Aug 28, 2020

Yang Deng, Wenxuan Zhang, Wai Lam

Abstract: Product-related question answering (QA) is an important but challenging task in E-Commerce. It leads to a great demand for automatic review-driven QA, which aims at providing instant responses to user-posted questions based on diverse product reviews. Nevertheless, the rich information about personal opinions in product reviews, which is essential to answer those product-specific questions, is underutilized in current generation-based review-driven QA studies. There are two main challenges when exploiting the opinion information from the reviews to facilitate opinion-aware answer generation: (i) jointly modeling opinionated and interrelated information between the question and reviews to capture important information for answer generation, and (ii) aggregating diverse opinion information to uncover the common opinion towards the given question. In this paper, we tackle opinion-aware answer generation by jointly learning answer generation and opinion mining tasks with a unified model. Two kinds of opinion fusion strategies, namely static and dynamic fusion, are proposed to distill and aggregate important opinion information learned from the opinion mining task into the answer generation process. Then a multi-view pointer-generator network is employed to generate opinion-aware answers for a given product-related question. Experimental results show that our method achieves superior performance on real-world E-Commerce QA datasets and effectively generates opinionated and informative answers.

* Accepted by CIKM 2020 (Full Paper)

Contextualized Code Representation Learning for Commit Message Generation

Jul 14, 2020

Lun Yiu Nie, Cuiyun Gao, Zhicong Zhong, Wai Lam, Yang Liu, Zenglin Xu

Abstract: Automatic generation of high-quality commit messages for code commits can substantially facilitate developers' work and coordination. However, the semantic gap between source code and natural language poses a major challenge for the task. Several studies have been proposed to alleviate the challenge, but none explicitly involves code contextual information during commit message generation. Specifically, existing research adopts static embedding for code tokens, which maps a token to the same vector regardless of its context. In this paper, we propose a novel Contextualized code representation learning method for commit message Generation (CoreGen). CoreGen first learns contextualized code representations that exploit the contextual information behind code commit sequences. The learned representations of code commits, built upon Transformer, are then transferred for downstream commit message generation. Experiments on the benchmark dataset demonstrate the superior effectiveness of our model over the baseline models, with an improvement of 28.18% in terms of BLEU-4 score. Furthermore, we also highlight the future opportunities in training contextualized code representations on larger code corpora as a solution to low-resource settings and adapting the pretrained code representations to other downstream code-to-text generation tasks.

AMR Parsing via Graph-Sequence Iterative Inference

Apr 29, 2020

Deng Cai, Wai Lam

Abstract: We propose a new end-to-end model that treats AMR parsing as a series of dual decisions on the input sequence and the incrementally constructed graph. At each time step, our model performs multiple rounds of attention, reasoning, and composition that aim to answer two critical questions: (1) which part of the input sequence to abstract; and (2) where in the output graph to construct the new concept. We show that the answers to these two questions are mutually causal. We design a model based on iterative inference that helps achieve better answers from both perspectives, leading to greatly improved parsing accuracy. Our experimental results significantly outperform all previously reported Smatch scores by large margins. Remarkably, without the help of any large-scale pre-trained language model (e.g., BERT), our model already surpasses the previous state of the art that uses BERT. With the help of BERT, we can push the state-of-the-art results to 80.2% on LDC2017T10 (AMR 2.0) and 75.4% on LDC2014T12 (AMR 1.0).

* ACL2020
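For readers unfamiliar with the Smatch scores reported in the abstract above, the following is a simplified sketch of the underlying idea: an F-score over the triples of a predicted graph and a gold graph. It assumes the variable names of the two graphs are already aligned, whereas the real Smatch metric also searches for the best variable mapping. The function name `triple_f1` and the example triples are hypothetical illustrations, not taken from the paper.

```python
def triple_f1(predicted, gold):
    """Precision, recall, and F1 over graph triples, assuming aligned variables."""
    pred_set, gold_set = set(predicted), set(gold)
    matched = len(pred_set & gold_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative triples for "The boy wants to go" (relation, source, target).
    gold = [
        ("instance", "w", "want-01"),
        ("instance", "b", "boy"),
        ("instance", "g", "go-01"),
        ("ARG0", "w", "b"),
        ("ARG1", "w", "g"),
        ("ARG0", "g", "b"),
    ]
    # The prediction gets one role label wrong (ARG1 instead of ARG0).
    predicted = gold[:5] + [("ARG1", "g", "b")]
    print(triple_f1(predicted, gold))
```

Here five of the six predicted triples match the gold graph, so precision, recall, and F1 all come out to 5/6.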

Context-aware Helpfulness Prediction for Online Product Reviews

Apr 27, 2020

Iyiola E. Olatunji, Xin Li, Wai Lam

Abstract: Modeling and prediction of review helpfulness has become more prominent due to the proliferation of e-commerce websites and online shops. Since the functionality of a product cannot be tested before buying, people often rely on different kinds of user reviews to decide whether or not to buy a product. However, quality reviews might be buried deep in a large heap of reviews. Therefore, recommending reviews to customers based on review quality is of the essence. Since there is no direct indication of review quality, most reviews use the information that "X out of Y" users found the review helpful to obtain the review quality. However, this approach undermines helpfulness prediction because not all reviews have statistically abundant votes. In this paper, we propose a deep learning model that predicts the helpfulness score of a review. This model is based on a convolutional neural network (CNN) and a context-aware encoding mechanism which can directly capture relationships between words irrespective of their distance in a long sequence. We validated our model on a human-annotated dataset, and the results show that our model significantly outperforms existing models for helpfulness prediction.

* Published as a proceeding paper in AIRS 2019
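The "X out of Y" signal mentioned in the abstract above, and why it breaks down for reviews with few votes, can be illustrated with a small sketch. The function name `helpfulness_ratio` and the `min_votes` threshold of 5 are assumptions made for illustration. They do not come from the paper, which instead predicts the score from the review text.

```python
def helpfulness_ratio(helpful_votes, total_votes, min_votes=5):
    """Return X / Y, or None when Y is too small for the ratio to be reliable.

    min_votes is a hypothetical cut-off chosen for illustration only.
    """
    if helpful_votes < 0 or total_votes < 0 or helpful_votes > total_votes:
        raise ValueError("expected 0 <= helpful_votes <= total_votes")
    if total_votes < min_votes:
        return None  # too few votes: the ratio carries little evidence
    return helpful_votes / total_votes


if __name__ == "__main__":
    print(helpfulness_ratio(45, 50))  # well-supported ratio
    print(helpfulness_ratio(1, 1))    # a single vote: no reliable score
```

A review with one helpful vote out of one would naively score higher than one with 45 out of 50, which is why vote-based labels are unreliable for sparsely voted reviews.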

Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization

Apr 07, 2020

Piji Li, Lidong Bing, Zhongyu Wei, Wai Lam

Abstract: The attention mechanism plays a dominant role in sequence generation models and has been used to improve the performance of machine translation and abstractive text summarization. Unlike in neural machine translation, in the task of text summarization, salience estimation for words, phrases, or sentences is a critical component, since the output summary is a distillation of the input text. Although the typical attention mechanism can conduct text fragment selection from the input text conditioned on the decoder states, there is still a gap in conducting direct and effective salience detection. To bring back direct salience estimation for summarization with neural networks, we propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation: supervised attention learning and unsupervised attention learning. We regard the attention weights as the salience information, which means that the semantic units with large attention values will be more important. The context information obtained based on the estimated salience is incorporated with the typical attention mechanism in the decoder to conduct summary generation. Extensive experiments on several benchmark datasets in different languages demonstrate the effectiveness of the proposed framework for the task of abstractive summarization.

* 11 pages, @CUHK. arXiv admin note: text overlap with arXiv:1803.11070, arXiv:1708.00625
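The idea in the abstract above of reading attention weights as salience can be sketched as follows: raw relevance scores are normalized with a softmax, and the units with the largest weights are treated as the most important. This is a generic illustration, not the paper's Multi-Attention Learning framework. The raw scores are made-up inputs and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_by_salience(units, scores):
    """Pair each unit with its attention weight, most salient first."""
    weights = softmax(scores)
    return sorted(zip(units, weights), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sentences = ["background detail", "main finding", "supporting evidence"]
    raw_scores = [0.1, 2.0, 1.0]  # made-up relevance scores for illustration
    for sentence, weight in rank_by_salience(sentences, raw_scores):
        print(f"{weight:.3f}  {sentence}")
```

Because the weights sum to one, they can be compared directly across units of the same input, which is what makes them usable as a salience ranking.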
