Daya Guo

Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder

Jun 15, 2020
Daya Guo, Duyu Tang, Nan Duan, Jian Yin, Daxin Jiang, Ming Zhou

Generating inferential texts about an event from different perspectives requires reasoning over the different contexts in which the event occurs. Existing works usually ignore the context that is not explicitly provided, resulting in a context-independent semantic representation that struggles to support the generation. To address this, we propose an approach that automatically finds evidence for an event from a large text corpus, and leverages the evidence to guide the generation of inferential texts. Our approach works in an encoder-decoder manner and is equipped with a Vector Quantised-Variational Autoencoder, where the encoder outputs representations from a distribution over discrete variables. Such discrete representations enable automatically selecting relevant evidence, which not only facilitates evidence-aware generation, but also provides a natural way to uncover rationales behind the generation. Our approach achieves state-of-the-art performance on both the Event2Mind and ATOMIC datasets. More importantly, we find that with discrete representations, our model selectively uses evidence to generate different inferential texts.
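
As a rough illustration of the discrete representations mentioned in the abstract: vector quantisation generally maps a continuous encoder output to the index of its nearest entry in a learned codebook. The sketch below shows only that lookup step. The function names and the codebook values are hypothetical, and it is not the authors' implementation; how the paper uses the resulting index to select evidence is not shown here.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantise(vector: Sequence[float],
             codebook: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Return the index and value of the codebook entry nearest to `vector`."""
    index = min(range(len(codebook)),
                key=lambda k: squared_distance(vector, codebook[k]))
    return index, list(codebook[index])


# Hypothetical 2-dimensional codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
index, code = quantise([0.9, 1.2], codebook)
print(index, code)
```

The returned index is the discrete variable; in a trained model the codebook entries would be learned rather than fixed by hand.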

* Accepted by ACL 2020

Inferential Text Generation with Multiple Knowledge Sources and Meta-Learning

Apr 15, 2020
Daya Guo, Akari Asai, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Jian Yin, Ming Zhou

We study the problem of generating inferential texts of events for a variety of commonsense relations, such as if-else relations. Existing approaches typically use limited evidence from training examples and learn each relation individually. In this work, we use multiple knowledge sources as fuel for the model. Existing commonsense knowledge bases like ConceptNet are dominated by taxonomic knowledge (e.g., isA and relatedTo relations) and contain only a limited amount of inferential knowledge. We use not only structured commonsense knowledge bases, but also natural language snippets from search-engine results. These sources are incorporated into a generative base model via a key-value memory network. In addition, we introduce a meta-learning based multi-task learning algorithm. For each targeted commonsense relation, we regard the learning of examples from other relations as the meta-training process, and the evaluation on examples from the targeted relation as the meta-test process. We conduct experiments on the Event2Mind and ATOMIC datasets. Results show that both the integration of multiple knowledge sources and the use of the meta-learning algorithm improve the performance.
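
For readers unfamiliar with key-value memory networks, the read operation is commonly a softmax attention over the keys followed by a weighted sum of the values. The sketch below assumes dot-product scoring and uses hypothetical names and toy vectors; the paper's exact scoring function and the way the result feeds the generator may differ.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(query: Sequence[float],
                keys: Sequence[Sequence[float]],
                values: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Attend over `keys` with `query`, then return the weighted sum of `values`."""
    # Dot-product scoring is an assumption made for this sketch.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


# Two memory slots with identical keys receive equal attention weight.
output, weights = read_memory([1.0, 0.0],
                              keys=[[1.0, 0.0], [1.0, 0.0]],
                              values=[[2.0], [4.0]])
print(output, weights)
```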

Pre-training Text Representations as Meta Learning

Apr 12, 2020
Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou, Songlin Hu

Pre-training text representations has recently been shown to significantly improve the state-of-the-art in many natural language processing tasks. The central goal of pre-training is to learn text representations that are useful for subsequent tasks. However, existing approaches are optimized by minimizing a proxy objective, such as the negative log likelihood of language modeling. In this work, we introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks. We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps. The standard multi-task learning objective adopted in BERT is a special case of our learning algorithm where the depth of meta-train is zero. We study the problem in two settings, unsupervised pre-training and supervised pre-training with different pre-training objectives, to verify the generality of our approach. Experimental results show that our algorithm brings improvements and learns better initializations for a variety of downstream tasks.
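
The claim that a meta-train depth of zero recovers the ordinary objective can be illustrated on a toy problem. The sketch below assumes a scalar parameter, quadratic per-batch losses, and a first-order approximation of the meta-gradient; all of these, along with the function names and learning rates, are simplifications chosen for illustration and are not taken from the paper.

```python
from typing import Sequence


def grad(theta: float, target: float) -> float:
    """Gradient of the toy loss (theta - target)**2 with respect to theta."""
    return 2.0 * (theta - target)


def meta_update(theta: float,
                train_targets: Sequence[float],
                test_target: float,
                depth: int,
                inner_lr: float = 0.1,
                outer_lr: float = 0.1) -> float:
    """One outer update after `depth` inner (meta-train) gradient steps."""
    adapted = theta
    for target in train_targets[:depth]:
        adapted -= inner_lr * grad(adapted, target)
    # First-order approximation (an assumption of this sketch): the gradient
    # computed at the adapted parameters is applied to the original ones.
    return theta - outer_lr * grad(adapted, test_target)


plain_step = meta_update(1.0, [3.0, 5.0], test_target=2.0, depth=0)
meta_step = meta_update(1.0, [3.0, 5.0], test_target=2.0, depth=1)
print(plain_step, meta_step)
```

With `depth=0` no inner step runs, so the update is an ordinary gradient step on the held-out batch; with `depth=1` the gradient is instead evaluated after one step of adaptation.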

* 2 figures, 3 tables

CodeBERT: A Pre-Trained Model for Programming and Natural Languages

Feb 19, 2020
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou

We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code documentation generation, etc. We develop CodeBERT with a Transformer-based neural architecture, and train it with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators. This enables us to utilize both bimodal data of NL-PL pairs and unimodal data, where the former provides input tokens for model training while the latter helps to learn better generators. We evaluate CodeBERT on two NL-PL applications by fine-tuning model parameters. Results show that CodeBERT achieves state-of-the-art performance on both natural language code search and code documentation generation tasks. Furthermore, to investigate what type of knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing, and evaluate in a zero-shot setting where parameters of pre-trained models are fixed. Results show that CodeBERT performs better than previous pre-trained models on NL-PL probing.
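
Replaced token detection can be pictured as a per-token binary classification: some input tokens are swapped for generator samples, and the model predicts which positions were changed. The sketch below only builds the target labels for one corrupted sequence. The label convention (1 for original, 0 for replaced), the function name, and the example tokens are assumptions made for illustration; no generator or model is included.

```python
from typing import List, Sequence


def rtd_labels(original: Sequence[str], corrupted: Sequence[str]) -> List[int]:
    """Label each position 1 if the token is unchanged, 0 if it was replaced."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [1 if o == c else 0 for o, c in zip(original, corrupted)]


# Hypothetical example: a generator replaced "max" with the plausible "min".
original = ["def", "max", "(", "a", ",", "b", ")"]
corrupted = ["def", "min", "(", "a", ",", "b", ")"]
labels = rtd_labels(original, corrupted)
print(labels)
```

In this sketch a sampled token that happens to equal the original is labelled as original, since the comparison is made on the resulting tokens.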

* 10 pages

Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base

Oct 11, 2019
Tao Shen, Xiubo Geng, Tao Qin, Daya Guo, Duyu Tang, Nan Duan, Guodong Long, Daxin Jiang

We consider the problem of conversational question answering over a large-scale knowledge base. To handle the huge entity vocabulary of a large-scale knowledge base, recent neural semantic parsing based approaches usually decompose the task into several subtasks and then solve them sequentially, which leads to the following issues: 1) errors in earlier subtasks will be propagated and negatively affect downstream ones; and 2) each subtask cannot naturally share supervision signals with others. To tackle these issues, we propose an innovative multi-task learning framework where a pointer-equipped semantic parsing model is designed to resolve coreference in conversations, and naturally empower joint learning with a novel type-aware entity detection model. The proposed framework thus enables shared supervision and alleviates the effect of error propagation. Experiments on a large-scale conversational question answering dataset containing 1.6M question answering pairs over 12.8M entities show that the proposed framework improves overall F1 score from 67% to 79% compared with previous state-of-the-art work.

* Accepted to appear at EMNLP-IJCNLP 2019

Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering

Sep 09, 2019
Shangwen Lv, Daya Guo, Jingjing Xu, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Songlin Hu

Commonsense question answering aims to answer questions which require background knowledge that is not explicitly expressed in the question. The key challenge is how to obtain evidence from external knowledge and make predictions based on the evidence. Recent works either learn to generate evidence from human-annotated evidence, which is expensive to collect, or extract evidence from either structured or unstructured knowledge bases, which fails to take advantage of both sources. In this work, we propose to automatically extract evidence from heterogeneous knowledge sources, and answer questions based on the extracted evidence. Specifically, we extract evidence from both a structured knowledge base (i.e., ConceptNet) and Wikipedia plain texts. We construct graphs for both sources to obtain the relational structures of evidence. Based on these graphs, we propose a graph-based approach consisting of a graph-based contextual word representation learning module and a graph-based inference module. The first module utilizes graph structural information to re-define the distance between words for learning better contextual word representations. The second module adopts a graph convolutional network to encode neighbor information into the representations of nodes, and aggregates evidence with a graph attention mechanism for predicting the final answer. Experimental results on the CommonsenseQA dataset illustrate that our graph-based approach over both knowledge sources brings improvement over strong baselines. Our approach achieves state-of-the-art accuracy (75.3%) on the CommonsenseQA leaderboard.

* 8 pages, 5 figures

Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing

Jun 17, 2019
Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, Jian Yin

In this paper, we present an approach to incorporate retrieved datapoints as supporting evidence for context-dependent semantic parsing, such as generating source code conditioned on the class environment. Our approach naturally combines a retrieval model and a meta-learner, where the former learns to find similar datapoints from the training data, and the latter considers retrieved datapoints as a pseudo task for fast adaptation. Specifically, our retriever is a context-aware encoder-decoder model with a latent variable which takes the context environment into consideration, and our meta-learner learns to utilize retrieved datapoints in a model-agnostic meta-learning paradigm for fast adaptation. We conduct experiments on the CONCODE and CSQA datasets, where the context refers to the class environment in Java code and the conversational history, respectively. We use a sequence-to-action model as the base semantic parser, which achieves state-of-the-art accuracy on both datasets. Results show that both the context-aware retriever and the meta-learning strategy improve accuracy, and our approach performs better than retrieve-and-edit baselines.

* Accepted by ACL 2019

Knowledge Based Machine Reading Comprehension

Sep 12, 2018
Yibo Sun, Daya Guo, Duyu Tang, Nan Duan, Zhao Yan, Xiaocheng Feng, Bing Qin

Machine reading comprehension (MRC) requires reasoning about both the knowledge involved in a document and knowledge about the world. However, existing datasets are typically dominated by questions that can be well solved by context matching, which fail to test this capability. To encourage progress on knowledge-based reasoning in MRC, we present knowledge-based MRC in this paper, and build a new dataset consisting of 40,047 question-answer pairs. The annotation of this dataset is designed so that successfully answering the questions requires understanding the knowledge involved in a document. We implement a framework consisting of both a question answering model and a question generation model, both of which take the knowledge extracted from the document as well as relevant facts from an external knowledge base such as Freebase/ProBase/Reverb/NELL. Results show that incorporating side information from the external KB improves the accuracy of the baseline question answering system. We compare it with a standard MRC model, BiDAF, and also report the difficulty of the dataset and lay out remaining challenges.

Question Generation from SQL Queries Improves Neural Semantic Parsing

Aug 27, 2018
Daya Guo, Yibo Sun, Duyu Tang, Nan Duan, Jian Yin, Hong Chi, James Cao, Peng Chen, Ming Zhou

We study how to learn a semantic parser of state-of-the-art accuracy with less supervised training data. We conduct our study on WikiSQL, the largest hand-annotated semantic parsing dataset to date. First, we demonstrate that question generation is an effective method that empowers us to learn a state-of-the-art neural network based semantic parser with thirty percent of the supervised training data. Second, we show that applying question generation to the full supervised training data further improves the state-of-the-art model. In addition, we observe that there is a logarithmic relationship between the accuracy of a semantic parser and the amount of training data.
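
A logarithmic relationship of the kind described can be written as accuracy = a + b * ln(n), where n is the number of training examples, and fitted by ordinary least squares on ln(n). The sketch below does this on synthetic points generated from a known curve, so the numbers are not results from the paper; the function name and the chosen coefficients are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple


def fit_log_curve(sizes: Sequence[int],
                  accuracies: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of accuracy = intercept + slope * ln(size)."""
    xs = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(accuracies) / len(accuracies)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data drawn from a known curve, used only to check the fit.
sizes = [1000, 2000, 4000, 8000]
accuracies = [0.2 + 0.05 * math.log(n) for n in sizes]
intercept, slope = fit_log_curve(sizes, accuracies)
print(intercept, slope)
```

Under such a curve, each doubling of the training set adds a constant `slope * ln(2)` to accuracy, which is why additional data yields diminishing returns.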

* The paper will be presented at EMNLP 2018
