"Text": models, code, and papers

Distantly-Supervised Neural Relation Extraction with Side Information using BERT

Apr 29, 2020
Johny Moreira, Chaina Oliveira, David Macedo, Cleber Zanchettin, Luciano Barbosa

Relation extraction (RE) consists in categorizing the relationship between entities in a sentence. A recent paradigm to develop relation extractors is Distant Supervision (DS), which allows the automatic creation of new datasets by taking an alignment between a text corpus and a Knowledge Base (KB). KBs can sometimes also provide additional information to the RE task. One of the methods that adopt this strategy is the RESIDE model, which proposes a distantly-supervised neural relation extraction using side information from KBs. Considering that this method outperformed state-of-the-art baselines, in this paper, we propose a related approach to RESIDE also using additional side information, but simplifying the sentence encoding with BERT embeddings. Through experiments, we show the effectiveness of the proposed method in Google Distant Supervision and Riedel datasets concerning the BGWA and RESIDE baseline methods. Although Area Under the Curve is decreased because of unbalanced datasets, P@N results have shown that the use of BERT as sentence encoding allows superior performance to baseline methods.
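
The ranking metric mentioned at the end of this abstract can be illustrated with a minimal sketch, assuming it refers to precision at N as commonly reported for distantly supervised relation extraction: rank the predicted facts by confidence and measure the fraction of correct ones among the top N. The function name `precision_at_n` and the toy predictions are illustrative assumptions, not taken from the paper.

```python
def precision_at_n(scored_predictions, n):
    """Fraction of correct predictions among the n most confident ones.

    scored_predictions: list of (confidence, is_correct) pairs.
    If fewer than n predictions exist, the available ones are used.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Highest confidence first.
    ranked = sorted(scored_predictions, key=lambda p: p[0], reverse=True)
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for _, is_correct in top if is_correct) / len(top)


# Toy data (hypothetical): confidence score and whether the fact is correct.
preds = [(0.95, True), (0.90, True), (0.80, False), (0.60, True), (0.30, False)]
p_at_2 = precision_at_n(preds, 2)
p_at_4 = precision_at_n(preds, 4)
```

Because only the top of the ranking is scored, this kind of metric is less affected by a long tail of low-confidence predictions than an area-under-the-curve summary.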


Who Wins the Game of Thrones? How Sentiments Improve the Prediction of Candidate Choice

Feb 29, 2020
Chaehan So

This paper analyzes how candidate choice prediction improves by different psychological predictors. To investigate this question, it collected an original survey dataset featuring the popular TV series "Game of Thrones". The respondents answered which character they anticipated to win in the final episode of the series, and explained their choice of the final candidate in free text from which sentiments were extracted. These sentiments were compared to feature sets derived from candidate likeability and candidate personality ratings. In our benchmarking of 10-fold cross-validation in 100 repetitions, all feature sets except the likeability ratings yielded a 10-11% improvement in accuracy on the holdout set over the base model. Treating the class imbalance with synthetic minority oversampling (SMOTE) increased holdout set performance by 20-34% but surprisingly not testing set performance. Taken together, our study provides a quantified estimation of the additional predictive value of psychological predictors. Likeability ratings were clearly outperformed by the feature sets based on personality, emotional valence, and basic emotions.
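
The evaluation protocol described above (10-fold cross-validation in 100 repetitions) can be sketched as an index generator. This is a minimal illustration under stated assumptions: the function name `repeated_kfold`, the seed, and the sample count are hypothetical, and neither the models nor the SMOTE oversampling step from the study are reproduced here.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=100, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold CV.

    Each repetition reshuffles the sample indices, then cuts them
    into k disjoint test folds of near-equal size.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(k):
            test = indices[fold::k]  # every k-th shuffled index
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield train, test


# Small demonstration: 50 samples, 10 folds, 3 repetitions.
splits = list(repeated_kfold(50, k=10, repeats=3))
```

If oversampling is used in such a setup, it is usually applied to the training indices of each split only, so that synthetic samples never leak into the fold being scored.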

* To be published in IEEE conference proceedings: International Conference on Artificial Intelligence in Information and Communication, ICAIIC 2020 


Exploring Neural Models for Parsing Natural Language into First-Order Logic

Feb 16, 2020
Hrituraj Singh, Milan Aggrawal, Balaji Krishnamurthy

Semantic parsing is the task of obtaining machine-interpretable representations from natural language text. We consider one such formal representation - First-Order Logic (FOL) and explore the capability of neural models in parsing English sentences to FOL. We model FOL parsing as a sequence to sequence mapping task where given a natural language sentence, it is encoded into an intermediate representation using an LSTM followed by a decoder which sequentially generates the predicates in the corresponding FOL formula. We improve the standard encoder-decoder model by introducing a variable alignment mechanism that enables it to align variables across predicates in the predicted FOL. We further show the effectiveness of predicting the category of FOL entity - Unary, Binary, Variables and Scoped Entities, at each decoder step as an auxiliary task on improving the consistency of generated FOL. We perform rigorous evaluations and extensive ablations. We also aim to release our code as well as large scale FOL dataset along with models to aid further research in logic-based parsing and inference in NLP.

* 11 Pages, 2 Figures 


Unsupervised Separation of Native and Loanwords for Malayalam and Telugu

Feb 12, 2020
Sridhama Prakhya, Deepak P

Quite often, words from one language are adopted within a different language without translation; these words appear in transliterated form in text written in the latter language. This phenomenon is particularly widespread within Indian languages where many words are loaned from English. In this paper, we address the task of identifying loanwords automatically and in an unsupervised manner, from large datasets of words from agglutinative Dravidian languages. We target two specific languages from the Dravidian family, viz., Malayalam and Telugu. Based on familiarity with the languages, we outline an observation that native words in both these languages tend to be characterized by a much more versatile stem - stem being a shorthand to denote the subword sequence formed by the first few characters of the word - than words that are loaned from other languages. We harness this observation to build an objective function and an iterative optimization formulation to optimize for it, yielding a scoring of each word's nativeness in the process. Through an extensive empirical analysis over real-world datasets from both Malayalam and Telugu, we illustrate the effectiveness of our method in quantifying nativeness effectively over available baselines for the task.
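
The stem-versatility observation can be illustrated with a rough sketch: take the first few characters of each word as its stem and count how many distinct continuations follow that stem in the word list. This is only a simplified proxy under stated assumptions; it is not the objective function or the iterative optimization from the paper, and the function name, the stem length of 3, and the placeholder words are all hypothetical.

```python
from collections import defaultdict


def stem_versatility(words, stem_len=3):
    """Map each stem (first stem_len characters) to the number of
    distinct continuations observed after it in the word list."""
    continuations = defaultdict(set)
    for word in words:
        if len(word) > stem_len:
            continuations[word[:stem_len]].add(word[stem_len:])
    return {stem: len(suffixes) for stem, suffixes in continuations.items()}


# Placeholder strings, not real Malayalam or Telugu words.
toy_words = ["abcde", "abcfg", "abchi", "xyzde"]
versatility = stem_versatility(toy_words)
```

Under the abstract's hypothesis, stems with many distinct continuations (here `abc`) would be more typical of native words than stems seen with few continuations (here `xyz`).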

* submitted to Natural Language Engineering; 22 pages; 4 figures. arXiv admin note: text overlap with arXiv:1803.09641 


Learning Contextualized Document Representations for Healthcare Answer Retrieval

Feb 03, 2020
Sebastian Arnold, Betty van Aken, Paul Grundmann, Felix A. Gers, Alexander Löser

We present Contextual Discourse Vectors (CDV), a distributed document representation for efficient answer retrieval from long healthcare documents. Our approach is based on structured query tuples of entities and aspects from free text and medical taxonomies. Our model leverages a dual encoder architecture with hierarchical LSTM layers and multi-task training to encode the position of clinical entities and aspects alongside the document discourse. We use our continuous representations to resolve queries with short latency using approximate nearest neighbor search on sentence level. We apply the CDV model for retrieving coherent answer passages from nine English public health resources from the Web, addressing both patients and medical professionals. Because there is no end-to-end training data available for all application scenarios, we train our model with self-supervised data from Wikipedia. We show that our generalized model significantly outperforms several state-of-the-art baselines for healthcare passage ranking and is able to adapt to heterogeneous domains without additional fine-tuning.
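
The retrieval step described above, resolving a query against sentence-level vectors, can be sketched as follows. As an assumption for illustration, this uses exact cosine-similarity search in plain Python as a stand-in for approximate nearest neighbor search; the vectors, sentence ids, and function names are hypothetical and do not come from the CDV model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query, sentence_vectors, k=2):
    """Return the ids of the k sentences most similar to the query."""
    scored = [(cosine(query, vec), sid) for sid, vec in sentence_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sid for _, sid in scored[:k]]


# Toy 2-dimensional sentence embeddings (hypothetical values).
sentence_vectors = {"s1": [1.0, 0.0], "s2": [0.9, 0.1], "s3": [0.0, 1.0]}
hits = top_k([1.0, 0.05], sentence_vectors, k=2)
```

An approximate index trades a small loss in recall for much lower latency on large collections; the exact scan here is only practical for small examples.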

* The Web Conference 2020 (WWW '20) 


Explaining with Counter Visual Attributes and Examples

Jan 27, 2020
Sadaf Gulshad, Arnold Smeulders

In this paper, we aim to explain the decisions of neural networks by utilizing multimodal information, namely the counter-intuitive attributes and counter visual examples that appear when perturbed samples are introduced. Different from previous work on interpreting decisions using saliency maps, text, or visual patches, we propose to use attributes and counter-attributes, and examples and counter-examples, as part of the visual explanations. When humans explain visual decisions, they tend to do so by providing attributes and examples. Hence, inspired by the way humans explain, in this paper we provide attribute-based and example-based explanations. Moreover, humans also tend to explain their visual decisions by adding counter-attributes and counter-examples to explain what is not seen. We introduce directed perturbations in the examples to observe which attribute values change when classifying the examples into the counter classes. This delivers intuitive counter-attributes and counter-examples. Our experiments with both coarse and fine-grained datasets show that attributes provide discriminating and human-understandable intuitive and counter-intuitive explanations.

* arXiv admin note: substantial text overlap with arXiv:1910.07416, arXiv:1904.08279 


Estimation and HAC-based Inference for Machine Learning Time Series Regressions

Dec 13, 2019
Andrii Babii, Eric Ghysels, Jonas Striaukas

Time series regression analysis in econometrics typically involves a framework relying on a set of mixing conditions to establish consistency and asymptotic normality of parameter estimates and HAC-type estimators of the residual long-run variances to conduct proper inference. This article introduces structured machine learning regressions for high-dimensional time series data using the aforementioned commonly used setting. To recognize the time series data structures we rely on the sparse-group LASSO estimator. We derive a new Fuk-Nagaev inequality for a class of $\tau$-dependent processes with heavier than Gaussian tails, nesting $\alpha$-mixing processes as a special case, and establish estimation, prediction, and inferential properties, including convergence rates of the HAC estimator for the long-run variance based on LASSO residuals. An empirical application to nowcasting US GDP growth indicates that the estimator performs favorably compared to other alternatives and that the text data can be a useful addition to more traditional numerical data.


Let Me Know What to Ask: Interrogative-Word-Aware Question Generation

Oct 30, 2019
Junmo Kang, Haritz Puerto San Roman, Sung-Hyon Myaeng

Question Generation (QG) is a Natural Language Processing (NLP) task that aids advances in Question Answering (QA) and conversational assistants. Existing models focus on generating a question based on a text and possibly the answer to the generated question. They need to determine the type of interrogative word to be generated while having to pay attention to the grammar and vocabulary of the question. In this work, we propose Interrogative-Word-Aware Question Generation (IWAQG), a pipelined system composed of two modules: an interrogative word classifier and a QG model. The first module predicts the interrogative word that is provided to the second module to create the question. Owing to an increased recall of deciding the interrogative words to be used for the generated questions, the proposed model achieves new state-of-the-art results on the task of QG in SQuAD, improving from 46.58 to 47.69 in BLEU-1, 17.55 to 18.53 in BLEU-4, 21.24 to 22.33 in METEOR, and from 44.53 to 46.94 in ROUGE-L.

* Accepted at 2nd Workshop on Machine Reading for Question Answering (MRQA), EMNLP 2019 


Query-Specific Knowledge Summarization with Entity Evolutionary Networks

Sep 29, 2019
Carl Yang, Lingrui Gan, Zongyi Wang, Jiaming Shen, Jinfeng Xiao, Jiawei Han

Given a query, unlike traditional IR that finds relevant documents or entities, in this work, we focus on retrieving both entities and their connections for insightful knowledge summarization. For example, given a query "computer vision" on a CS literature corpus, rather than returning a list of relevant entities like "cnn", "imagenet" and "svm", we are interested in the connections among them, and furthermore, the evolution patterns of such connections along particular ordinal dimensions such as time. Particularly, we hope to provide structural knowledge relevant to the query, such as "svm" is related to "imagenet" but not "cnn". Moreover, we aim to model the changing trends of the connections, such as "cnn" becomes highly related to "imagenet" after 2010, which enables the tracking of knowledge evolutions. In this work, to facilitate such a novel insightful search system, we propose SetEvolve, which is a unified framework based on nonparanormal graphical models for evolutionary network construction from large text corpora. Systematic experiments on synthetic data and insightful case studies on real-world corpora demonstrate the utility of SetEvolve.

* published in CIKM 2019 


OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction

Sep 28, 2019
Xu Han, Tianyu Gao, Yuan Yao, Demin Ye, Zhiyuan Liu, Maosong Sun

OpenNRE is an open-source and extensible toolkit that provides a unified framework to implement neural models for relation extraction (RE). Specifically, by implementing typical RE methods, OpenNRE not only allows developers to train custom models to extract structured relational facts from plain text but also supports quick model validation for researchers. Besides, OpenNRE provides various functional RE modules based on both TensorFlow and PyTorch to maintain sufficient modularity and extensibility, making it easy to incorporate new models into the framework. Besides the toolkit, we also release an online system to meet real-time extraction without any training and deploying. Meanwhile, the online system can extract facts in various scenarios as well as align the extracted facts to Wikidata, which may benefit various downstream knowledge-driven applications (e.g., information retrieval and question answering). More details of the toolkit and online system can be obtained from
