"Topic": models, code, and papers

A Multivariate Model for Representing Semantic Non-compositionality

Aug 15, 2019
Meghdad Farahmand

Semantically non-compositional phrases constitute an intriguing research topic in Natural Language Processing. Semantic non-compositionality, the situation in which the meaning of a phrase cannot be derived from the meanings of its components, is the main characteristic of such phrases; however, they also bear other characteristics, such as high statistical association and non-substitutability. In this work, we present a model for identifying non-compositional phrases that takes all of these characteristics into account. We show that the presented model remarkably outperforms existing models for identifying non-compositional phrases, most of which focus on only one of these characteristics.

* 11 content pages, 10 figures 


Issue Framing in Online Discussion Fora

Apr 09, 2019
Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders Søgaard

In online discussion fora, speakers often make arguments for or against something, say birth control, by highlighting certain aspects of the topic. In social science, this is referred to as issue framing. In this paper, we introduce a new issue frame annotated corpus of online discussions. We explore to what extent models trained to detect issue frames in newswire and social media can be transferred to the domain of discussion fora, using a combination of multi-task and adversarial training, assuming only unlabeled training data in the target domain.

* To appear in NAACL-HLT 2019 


Extractive Summarization of EHR Discharge Notes

Oct 26, 2018
Emily Alsentzer, Anne Kim

Patient summarization is essential for clinicians to provide coordinated care and practice effective communication. Automated summarization has the potential to save time, standardize notes, aid clinical decision making, and reduce medical errors. Here we provide an upper bound on extractive summarization of discharge notes and develop an LSTM model to sequentially label topics of history of present illness notes. We achieve an F1 score of 0.876, which indicates that this model can be employed to create a dataset for evaluation of extractive summarization methods.


Self-Organization and Artificial Life: A Review

Apr 03, 2018
Carlos Gershenson, Vito Trianni, Justin Werfel, Hiroki Sayama

Self-organization has been an important concept within a number of disciplines, and Artificial Life (ALife) has also heavily utilized it since its inception. The term and its implications, however, are often confusing or misinterpreted. In this work, we provide a mini-review of self-organization and its relationship with ALife, aiming to initiate discussions on this important topic with the interested audience. We first articulate some fundamental aspects of self-organization, outline its usage, and review its applications to ALife within its soft, hard, and wet domains. We also provide perspectives for further research.

* 8 pages, submitted to ALife 2018 


PersonaBank: A Corpus of Personal Narratives and Their Story Intention Graphs

Aug 30, 2017
Stephanie M. Lukin, Kevin Bowden, Casey Barackman, Marilyn A. Walker

We present a new corpus, PersonaBank, consisting of 108 personal stories from weblogs that have been annotated with their Story Intention Graphs, a deep representation of the fabula of a story. We describe the topics of the stories and the basis of the Story Intention Graph representation, as well as the process of annotating the stories to produce the Story Intention Graphs and the challenges of adapting the tool to this new personal narrative domain. We also discuss how the corpus can be used in applications that retell the story using different styles of tellings, co-tellings, or as a content planner.

* International Conference on Language Resources and Evaluation (LREC 2016) 


Latent Tree Analysis

Oct 01, 2016
Nevin L. Zhang, Leonard K. M. Poon

Latent tree analysis seeks to model the correlations among a set of random variables using a tree of latent variables. It was proposed as an improvement to latent class analysis, a method widely used in social sciences and medicine to identify homogeneous subgroups in a population. It provides new and fruitful perspectives on a number of machine learning areas, including cluster analysis, topic detection, and deep probabilistic modeling. This paper gives an overview of the research on latent tree analysis and various ways it is used in practice.

* 7 pages, 5 figures 


A System for Probabilistic Linking of Thesauri and Classification Systems

Mar 21, 2016
Lisa Posch, Philipp Schaer, Arnim Bleier, Markus Strohmaier

This paper presents a system which creates and visualizes probabilistic semantic links between concepts in a thesaurus and classes in a classification system. For creating the links, we build on the Polylingual Labeled Topic Model (PLL-TM). PLL-TM identifies probable thesaurus descriptors for each class in the classification system by using information from the natural language text of documents, their assigned thesaurus descriptors and their designated classes. The links are then presented to users of the system in an interactive visualization, providing them with an automatically generated overview of the relations between the thesaurus and the classification system.

* KI - Künstliche Intelligenz, 2015 


Predicting health inspection results from online restaurant reviews

Mar 17, 2016
Samantha Wong, Hamidreza Chinaei, Frank Rudzicz

Informatics around public health are increasingly shifting from the professional to the public spheres. In this work, we apply linguistic analytics to restaurant reviews from Yelp in order to automatically predict official health inspection reports. We consider two types of feature sets, namely keyword detection and topic model features, and use these in several classification methods. Our empirical analysis shows that these extracted features can predict public health inspection reports with over 90% accuracy using simple support vector machines.

* 7 pages, 2 figures, 2 tables 


SciRecSys: A Recommendation System for Scientific Publication by Discovering Keyword Relationships

Feb 27, 2015
Vu Le Anh, Vo Hoang Hai, Hung Nghiep Tran, Jason J. Jung

In this work, we propose a new approach, based on a Markov Chain model, for discovering various relationships among keywords across scientific publications. This is an important problem, since keywords are the basic elements for representing abstract objects such as documents, user profiles, topics, and many other things. Our model is very effective since it combines four important factors in scientific publications: content, publicity, impact, and randomness. In particular, we present a recommendation system (called SciRecSys) that helps users efficiently find relevant articles.


Bounding the Probability of Causation in Mediation Analysis

Nov 10, 2014
A. P. Dawid, R. Murtas, M. Musio

Given empirical evidence for the dependence of an outcome variable on an exposure variable, we can typically only provide bounds for the "probability of causation" in the case of an individual who has developed the outcome after being exposed. We show how these bounds can be adapted or improved if further information becomes available. In addition to reviewing existing work on this topic, we provide a new analysis for the case where a mediating variable can be observed. In particular we show how the probability of causation can be bounded when there is no direct effect and no confounding. Keywords: Causal inference, Mediation Analysis, Probability of Causation

* 9 pages, 1 figure, 3 tables 
