Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

How to "DODGE" Complex Software Analytics?

Feb 05, 2019
Amritanshu Agrawal, Wei Fu, Di Chen, Xipeng Shen, Tim Menzies

AI software is still software. Software engineers need better tools to make better use of AI software. For example, for software defect prediction and software text mining, the default tunings for software analytics tools can be improved with "hyperparameter optimization" tools that decide (e.g.,) how many trees are needed in a random forest. Hyperparameter optimization is unnecessarily slow when optimizers waste time exploring redundant options (i.e., pairs of tunings with indistinguishably different results). By ignoring redundant tunings, the Dodge(E) hyperparameter optimization tool can run orders of magnitude faster, yet still find better tunings than prior state-of-the-art algorithms (for software defect prediction and software text mining).

* 12 Pages, Submitted to TSE Jorunal 

  Access Paper or Ask Questions

Dynamic Integration of Background Knowledge in Neural NLU Systems

Aug 21, 2018
Dirk Weissenborn, Tomáš Kočiský, Chris Dyer

Common-sense and background knowledge is required to understand natural language, but in most neural natural language understanding (NLU) systems, this knowledge must be acquired from training corpora during learning, and then it is static at test time. We introduce a new architecture for the dynamic integration of explicit background knowledge in NLU models. A general-purpose reading module reads background knowledge in the form of free-text statements (together with task-specific text inputs) and yields refined word representations to a task-specific NLU architecture that reprocesses the task inputs with these representations. Experiments on document question answering (DQA) and recognizing textual entailment (RTE) demonstrate the effectiveness and flexibility of the approach. Analysis shows that our model learns to exploit knowledge in a semantically appropriate way.

  Access Paper or Ask Questions

Sentence Ordering and Coherence Modeling using Recurrent Neural Networks

Dec 22, 2017
Lajanugen Logeswaran, Honglak Lee, Dragomir Radev

Modeling the structure of coherent texts is a key NLP problem. The task of coherently organizing a given set of sentences has been commonly used to build and evaluate models that understand such structure. We propose an end-to-end unsupervised deep learning approach based on the set-to-sequence framework to address this problem. Our model strongly outperforms prior methods in the order discrimination task and a novel task of ordering abstracts from scientific articles. Furthermore, our work shows that useful text representations can be obtained by learning to order sentences. Visualizing the learned sentence representations shows that the model captures high-level logical structure in paragraphs. Our representations perform comparably to state-of-the-art pre-training methods on sentence similarity and paraphrase detection tasks.

  Access Paper or Ask Questions

Reading Wikipedia to Answer Open-Domain Questions

Apr 28, 2017
Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes

This paper proposes to tackle open- domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles). Our approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA datasets indicate that (1) both modules are highly competitive with respect to existing counterparts and (2) multitask learning using distant supervision on their combination is an effective complete system on this challenging task.

* ACL2017, 10 pages 

  Access Paper or Ask Questions

Detecting Context Dependent Messages in a Conversational Environment

Nov 03, 2016
Chaozhuo Li, Yu Wu, Wei Wu, Chen Xing, Zhoujun Li, Ming Zhou

While automatic response generation for building chatbot systems has drawn a lot of attention recently, there is limited understanding on when we need to consider the linguistic context of an input text in the generation process. The task is challenging, as messages in a conversational environment are short and informal, and evidence that can indicate a message is context dependent is scarce. After a study of social conversation data crawled from the web, we observed that some characteristics estimated from the responses of messages are discriminative for identifying context dependent messages. With the characteristics as weak supervision, we propose using a Long Short Term Memory (LSTM) network to learn a classifier. Our method carries out text representation and classifier learning in a unified framework. Experimental results show that the proposed method can significantly outperform baseline methods on accuracy of classification.

* Accepted paper at COLING 2016 

  Access Paper or Ask Questions

Knowledge Transfer with Medical Language Embeddings

Feb 10, 2016
Stephanie L. Hyland, Theofanis Karaletsos, Gunnar Rätsch

Identifying relationships between concepts is a key aspect of scientific knowledge synthesis. Finding these links often requires a researcher to laboriously search through scien- tific papers and databases, as the size of these resources grows ever larger. In this paper we describe how distributional semantics can be used to unify structured knowledge graphs with unstructured text to predict new relationships between medical concepts, using a probabilistic generative model. Our approach is also designed to ameliorate data sparsity and scarcity issues in the medical domain, which make language modelling more challenging. Specifically, we integrate the medical relational database (SemMedDB) with text from electronic health records (EHRs) to perform knowledge graph completion. We further demonstrate the ability of our model to predict relationships between tokens not appearing in the relational database.

* 6 pages, 2 figures, to appear at SDM-DMMH 2016 

  Access Paper or Ask Questions

Statistical sentiment analysis performance in Opinum

Mar 03, 2013
Boyan Bonev, Gema Ramírez-Sánchez, Sergio Ortiz Rojas

The classification of opinion texts in positive and negative is becoming a subject of great interest in sentiment analysis. The existence of many labeled opinions motivates the use of statistical and machine-learning methods. First-order statistics have proven to be very limited in this field. The Opinum approach is based on the order of the words without using any syntactic and semantic information. It consists of building one probabilistic model for the positive and another one for the negative opinions. Then the test opinions are compared to both models and a decision and confidence measure are calculated. In order to reduce the complexity of the training corpus we first lemmatize the texts and we replace most named-entities with wildcards. Opinum presents an accuracy above 81% for Spanish opinions in the financial products domain. In this work we discuss which are the most important factors that have impact on the classification performance.

  Access Paper or Ask Questions

Inference and Plausible Reasoning in a Natural Language Understanding System Based on Object-Oriented Semantics

Feb 01, 2012
Yuriy Ostapov

Algorithms of inference in a computer system oriented to input and semantic processing of text information are presented. Such inference is necessary for logical questions when the direct comparison of objects from a question and database can not give a result. The following classes of problems are considered: a check of hypotheses for persons and non-typical actions, the determination of persons and circumstances for non-typical actions, planning actions, the determination of event cause and state of persons. To form an answer both deduction and plausible reasoning are used. As a knowledge domain under consideration is social behavior of persons, plausible reasoning is based on laws of social psychology. Proposed algorithms of inference and plausible reasoning can be realized in computer systems closely connected with text processing (criminology, operation of business, medicine, document systems).

  Access Paper or Ask Questions

A Model of Lexical Attraction and Repulsion

Jun 16, 1997
Doug Beeferman, Adam Berger, John Lafferty

This paper introduces new methods based on exponential families for modeling the correlations between words in text and speech. While previous work assumed the effects of word co-occurrence statistics to be constant over a window of several hundred words, we show that their influence is nonstationary on a much smaller time scale. Empirical data drawn from English and Japanese text, as well as conversational speech, reveals that the ``attraction'' between words decays exponentially, while stylistic and syntactic contraints create a ``repulsion'' between words that discourages close co-occurrence. We show that these characteristics are well described by simple mixture models based on two-stage exponential distributions which can be trained using the EM algorithm. The resulting distance distributions can then be incorporated as penalizing features in an exponential language model.

* 8 pages, LaTeX source and postscript figures for ACL/EACL'97 paper 

  Access Paper or Ask Questions

Contrastive Document Representation Learning with Graph Attention Networks

Oct 20, 2021
Peng Xu, Xinchi Chen, Xiaofei Ma, Zhiheng Huang, Bing Xiang

Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representation of text. However, due to the quadratic self-attention complexity, most of the pretrained Transformers models can only handle relatively short text. It is still a challenge when it comes to modeling very long documents. In this work, we propose to use a graph attention network on top of the available pretrained Transformers model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large amount of unlabeled corpus. Empirically, we demonstrate the effectiveness of our approaches in document classification and document retrieval tasks.

* Findings of EMNLP 2021 

  Access Paper or Ask Questions