Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization

Sep 09, 2021
Alireza Salemi, Emad Kebriaei, Ghazal Neisi Minaei, Azadeh Shakery

Abstractive text summarization is one of the areas influenced by the emergence of pre-trained language models. Current pre-training works in abstractive summarization give more points to the summaries with more words in common with the main text and pay less attention to the semantic similarity between generated sentences and the original document. We propose ARMAN, a Transformer-based encoder-decoder model pre-trained with three novel objectives to address this issue. In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary. To summarize more accurately and similar to human writing patterns, we applied modified sentence reordering. We evaluated our proposed models on six downstream Persian summarization tasks. Experimental results show that our proposed model achieves state-of-the-art performance on all six summarization tasks measured by ROUGE and BERTScore. Our models also outperform prior works in textual entailment, question paraphrasing, and multiple choice question answering. Finally, we established a human evaluation and show that using the semantic score significantly improves summarization results.

  Access Paper or Ask Questions

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models

Sep 07, 2021
Zhihua Jin, Xin Jiang, Xingbo Wang, Qun Liu, Yong Wang, Xiaozhe Ren, Huamin Qu

Existing generative pre-trained language models (e.g., GPT) focus on modeling the language structure and semantics of general texts. However, those models do not consider the numerical properties of numbers and cannot perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation). In this paper, we propose NumGPT, a generative pre-trained model that explicitly models the numerical properties of numbers in texts. Specifically, it leverages a prototype-based numeral embedding to encode the mantissa of the number and an individual embedding to encode the exponent of the number. A numeral-aware loss function is designed to integrate numerals into the pre-training objective of NumGPT. We conduct extensive experiments on four different datasets to evaluate the numeracy ability of NumGPT. The experiment results show that NumGPT outperforms baseline models (e.g., GPT and GPT with DICE) on a range of numerical reasoning tasks such as measurement estimation, number comparison, math word problems, and magnitude classification. Ablation studies are also conducted to evaluate the impact of pre-training and model hyperparameters on the performance.

* 8 pages, 3 figures 

  Access Paper or Ask Questions

Learning Knowledge Graph-based World Models of Textual Environments

Jun 17, 2021
Prithviraj Ammanabrolu, Mark O. Riedl

World models improve a learning agent's ability to efficiently operate in interactive and situated environments. This work focuses on the task of building world models of text-based game environments. Text-based games, or interactive narratives, are reinforcement learning environments in which agents perceive and interact with the world using textual natural language. These environments contain long, multi-step puzzles or quests woven through a world that is filled with hundreds of characters, locations, and objects. Our world model learns to simultaneously: (1) predict changes in the world caused by an agent's actions when representing the world as a knowledge graph; and (2) generate the set of contextually relevant natural language actions required to operate in the world. We frame this task as a Set of Sequences generation problem by exploiting the inherent structure of knowledge graphs and actions and introduce both a transformer-based multi-task architecture and a loss function to train it. A zero-shot ablation study on never-before-seen textual worlds shows that our methodology significantly outperforms existing textual world modeling techniques as well as the importance of each of our contributions.

* Preprint. Under review 

  Access Paper or Ask Questions

Learning Relation Alignment for Calibrated Cross-modal Retrieval

Jun 01, 2021
Shuhuai Ren, Junyang Lin, Guangxiang Zhao, Rui Men, An Yang, Jingren Zhou, Xu Sun, Hongxia Yang

Despite the achievements of large-scale multimodal pre-training approaches, cross-modal retrieval, e.g., image-text retrieval, remains a challenging task. To bridge the semantic gap between the two modalities, previous studies mainly focus on word-region alignment at the object level, lacking the matching between the linguistic relation among the words and the visual relation among the regions. The neglect of such relation consistency impairs the contextualized representation of image-text pairs and hinders the model performance and the interpretability. In this paper, we first propose a novel metric, Intra-modal Self-attention Distance (ISD), to quantify the relation consistency by measuring the semantic distance between linguistic and visual relations. In response, we present Inter-modal Alignment on Intra-modal Self-attentions (IAIS), a regularized training method to optimize the ISD and calibrate intra-modal self-attentions from the two modalities mutually via inter-modal alignment. The IAIS regularizer boosts the performance of prevailing models on Flickr30k and MS COCO datasets by a considerable margin, which demonstrates the superiority of our approach.

* Accepted by ACL-IJCNLP 2021 main conference (Long Paper) 

  Access Paper or Ask Questions

Conversational Entity Linking: Problem Definition and Datasets

May 11, 2021
Hideaki Joko, Faegheh Hasibi, Krisztian Balog, Arjen P. de Vries

Machine understanding of user utterances in conversational systems is of utmost importance for enabling engaging and meaningful conversations with users. Entity Linking (EL) is one of the means of text understanding, with proven efficacy for various downstream tasks in information retrieval. In this paper, we study entity linking for conversational systems. To develop a better understanding of what EL in a conversational setting entails, we analyze a large number of dialogues from existing conversational datasets and annotate references to concepts, named entities, and personal entities using crowdsourcing. Based on the annotated dialogues, we identify the main characteristics of conversational entity linking. Further, we report on the performance of traditional EL systems on our Conversational Entity Linking dataset, ConEL, and present an extension to these methods to better fit the conversational setting. The resources released with this paper include annotated datasets, detailed descriptions of crowdsourcing setups, as well as the annotations produced by various EL systems. These new resources allow for an investigation of how the role of entities in conversations is different from that in documents or isolated short text utterances like queries and tweets, and complement existing conversational datasets.

  Access Paper or Ask Questions

Narrative Incoherence Detection

Dec 21, 2020
Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan

Motivated by the increasing popularity of intelligent editing assistant, we introduce and investigate the task of narrative incoherence detection: Given a (corrupted) long-form narrative, decide whether there exists some semantic discrepancy in the narrative flow. Specifically, we focus on the missing sentence and incoherent sentence detection. Despite its simple setup, this task is challenging as the model needs to understand and analyze a multi-sentence narrative text, and make decisions at the sentence level. As an initial step towards this task, we implement several baselines either directly analyzing the raw text (\textit{token-level}) or analyzing learned sentence representations (\textit{sentence-level}). We observe that while token-level modeling enjoys greater expressive power and hence better performance, sentence-level modeling possesses an advantage in efficiency and flexibility. With pre-training on large-scale data and cycle-consistent sentence embedding, our extended sentence-level model can achieve comparable detection accuracy to the token-level model. As a by-product, such a strategy enables simultaneous incoherence detection and infilling/modification suggestions.

  Access Paper or Ask Questions

Saying No is An Art: Contextualized Fallback Responses for Unanswerable Dialogue Queries

Dec 17, 2020
Ashish Shrivastava, Kaustubh Dhole, Abhinav Bhatt, Sharvani Raghunath

Despite end-to-end neural systems making significant progress in the last decade for task-oriented as well as chit-chat based dialogue systems, most dialogue systems rely on hybrid approaches which use a combination of rule-based, retrieval and generative approaches for generating a set of ranked responses. Such dialogue systems need to rely on a fallback mechanism to respond to out-of-domain or novel user queries which are not answerable within the scope of the dialog system. While, dialog systems today rely on static and unnatural responses like "I don't know the answer to that question" or "I'm not sure about that", we design a neural approach which generates responses which are contextually aware with the user query as well as say no to the user. Such customized responses provide paraphrasing ability and contextualization as well as improve the interaction with the user and reduce dialogue monotonicity. Our simple approach makes use of rules over dependency parses and a text-to-text transformer fine-tuned on synthetic data of question-response pairs generating highly relevant, grammatical as well as diverse questions. We perform automatic and manual evaluations to demonstrate the efficacy of the system.

  Access Paper or Ask Questions

Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering

Sep 18, 2020
Shih-Ting Lin, Greg Durrett

Current methods in open-domain question answering (QA) usually employ a pipeline of first retrieving relevant documents, then applying strong reading comprehension (RC) models to that retrieved text. However, modern RC models are complex and expensive to run, so techniques to prune the space of retrieved text are critical to allow this approach to scale. In this paper, we focus on approaches which apply an intermediate sentence selection step to address this issue, and investigate the best practices for this approach. We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question. We examine trade-offs between processing speed and task performance in these two approaches, and demonstrate an ensemble module that represents a hybrid of the two. From experiments on Open-SQuAD and TriviaQA, we show that very lightweight QA models can do well at this task, but retrieval-based models are faster still. An ensemble module we describe balances between the two and generalizes well cross-domain.

  Access Paper or Ask Questions

Technology Knowledge Graph Based on Patent Data

Jun 04, 2019
Serhad Sarica, Jianxi Luo, Kristin L. Wood

The growing developments in general semantic networks (or knowledge graphs) have motivated us to build a large-scale comprehensive knowledge graph of engineering data for engineering knowledge discovery, technology search and retrieval, and artificial intelligence for engineering design and innovation. Specially, we constructed a technology knowledge graph (TKG) that covers the elemental concepts in all domains of technology and their semantic associations by mining the complete U.S. patent database from 1976. This paper presents natural language processing techniques to extract terms from massive patent texts and word embedding models to vectorize such terms and establish their semantic relationships. We report and evaluate the TKG technology knowledge graph for retrieving terms, semantic similarity and analogy. The TKG may serve as an infrastructure to support a wide range of applications, e.g., technical text summaries, search query predictions, relational knowledge discovery, and design ideation support, in the context of engineering and technology. To support such applications, we made the TKG public via an online interface for public users to retrieve technology-related terms and their relevancies.

  Access Paper or Ask Questions