Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Leveraging Deep Neural Networks and Knowledge Graphs for Entity Disambiguation

Apr 28, 2015
Hongzhao Huang, Larry Heck, Heng Ji

Entity Disambiguation aims to link mentions of ambiguous entities to a knowledge base (e.g., Wikipedia). Modeling topical coherence is crucial for this task based on the assumption that information from the same semantic context tends to belong to the same topic. This paper presents a novel deep semantic relatedness model (DSRM) based on deep neural networks (DNN) and semantic knowledge graphs (KGs) to measure entity semantic relatedness for topical coherence modeling. The DSRM is directly trained on large-scale KGs and it maps heterogeneous types of knowledge of an entity from KGs to numerical feature vectors in a latent space such that the distance between two semantically-related entities is minimized. Compared with the state-of-the-art relatedness approach proposed by (Milne and Witten, 2008a), the DSRM obtains 19.4% and 24.5% reductions in entity disambiguation errors on two publicly available datasets respectively.

  Access Paper or Ask Questions

Batch Clustering for Multilingual News Streaming

Apr 17, 2020
Mathis Linger, Mhamed Hajaiej

Nowadays, digital news articles are widely available, published by various editors and often written in different languages. This large volume of diverse and unorganized information makes human reading very difficult or almost impossible. This leads to a need for algorithms able to arrange high amount of multilingual news into stories. To this purpose, we extend previous works on Topic Detection and Tracking, and propose a new system inspired from newsLens. We process articles per batch, looking for monolingual local topics which are then linked across time and languages. Here, we introduce a novel "replaying" strategy to link monolingual local topics into stories. Besides, we propose new fine tuned multilingual embedding using SBERT to create crosslingual stories. Our system gives monolingual state-of-the-art results on dataset of Spanish and German news and crosslingual state-of-the-art results on English, Spanish and German news.

* Proceedings of Text2Story - Third Workshop on Narrative Extraction From Texts co-located with 42nd European Conference on Information Retrieval (ECIR 2020) Lisbon, Portugal, April 14th, 2020 
* 7 pages, 2 figures 

  Access Paper or Ask Questions

A novel pLSA based Traffic Signs Classification System

Mar 23, 2015
Mrinal Haloi

In this work we developed a novel and fast traffic sign recognition system, a very important part for advanced driver assistance system and for autonomous driving. Traffic signs play a very vital role in safe driving and avoiding accident. We have used image processing and topic discovery model pLSA to tackle this challenging multiclass classification problem. Our algorithm is consist of two parts, shape classification and sign classification for improved accuracy. For processing and representation of image we have used bag of features model with SIFT local descriptor. Where a visual vocabulary of size 300 words are formed using k-means codebook formation algorithm. We exploited the concept that every image is a collection of visual topics and images having same topics will belong to same category. Our algorithm is tested on German traffic sign recognition benchmark (GTSRB) and gives very promising result near to existing state of the art techniques.

* APMediaCast-2015, Bali, Indonesia 

  Access Paper or Ask Questions

Guiding Neural Story Generation with Reader Models

Dec 16, 2021
Xiangyu Peng, Kaige Xie, Amal Alabdulkarim, Harshith Kayam, Samihan Dani, Mark O. Riedl

Automated storytelling has long captured the attention of researchers for the ubiquity of narratives in everyday life. However, it is challenging to maintain coherence and stay on-topic toward a specific ending when generating narratives with neural language models. In this paper, we introduce Story generation with Reader Models (StoRM), a framework in which a reader model is used to reason about the story should progress. A reader model infers what a human reader believes about the concepts, entities, and relations about the fictional story world. We show how an explicit reader model represented as a knowledge graph affords story coherence and provides controllability in the form of achieving a given story world state goal. Experiments show that our model produces significantly more coherent and on-topic stories, outperforming baselines in dimensions including plot plausibility and staying on topic. Our system also outperforms outline-guided story generation baselines in composing given concepts without ordering.

  Access Paper or Ask Questions

An Analysis of COVID-19 Knowledge Graph Construction and Applications

Oct 10, 2021
Dominic Flocco, Bryce Palmer-Toy, Ruixiao Wang, Hongyu Zhu, Rishi Sonthalia, Junyuan Lin, Andrea L. Bertozzi, P. Jeffrey Brantingham

The construction and application of knowledge graphs have seen a rapid increase across many disciplines in recent years. Additionally, the problem of uncovering relationships between developments in the COVID-19 pandemic and social media behavior is of great interest to researchers hoping to curb the spread of the disease. In this paper we present a knowledge graph constructed from COVID-19 related tweets in the Los Angeles area, supplemented with federal and state policy announcements and disease spread statistics. By incorporating dates, topics, and events as entities, we construct a knowledge graph that describes the connections between these useful information. We use natural language processing and change point analysis to extract tweet-topic, tweet-date, and event-date relations. Further analysis on the constructed knowledge graph provides insight into how tweets reflect public sentiments towards COVID-19 related topics and how changes in these sentiments correlate with real-world events.

  Access Paper or Ask Questions

Improving unsupervised neural aspect extraction for online discussions using out-of-domain classification

Jun 17, 2020
Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey Nikolenko

Deep learning architectures based on self-attention have recently achieved and surpassed state of the art results in the task of unsupervised aspect extraction and topic modeling. While models such as neural attention-based aspect extraction (ABAE) have been successfully applied to user-generated texts, they are less coherent when applied to traditional data sources such as news articles and newsgroup documents. In this work, we introduce a simple approach based on sentence filtering in order to improve topical aspects learned from newsgroups-based content without modifying the basic mechanism of ABAE. We train a probabilistic classifier to distinguish between out-of-domain texts (outer dataset) and in-domain texts (target dataset). Then, during data preparation we filter out sentences that have a low probability of being in-domain and train the neural model on the remaining sentences. The positive effect of sentence filtering on topic coherence is demonstrated in comparison to aspect extraction models trained on unfiltered texts.

* Journal of Intelligent & Fuzzy Systems, pre-press, 

  Access Paper or Ask Questions

Building Ethics into Artificial Intelligence

Dec 07, 2018
Han Yu, Zhiqi Shen, Chunyan Miao, Cyril Leung, Victor R. Lesser, Qiang Yang

As artificial intelligence (AI) systems become increasingly ubiquitous, the topic of AI governance for ethical decision-making by AI has captured public imagination. Within the AI research community, this topic remains less familiar to many researchers. In this paper, we complement existing surveys, which largely focused on the psychological, social and legal discussions of the topic, with an analysis of recent advances in technical solutions for AI governance. By reviewing publications in leading AI conferences including AAAI, AAMAS, ECAI and IJCAI, we propose a taxonomy which divides the field into four areas: 1) exploring ethical dilemmas; 2) individual ethical decision frameworks; 3) collective ethical decision frameworks; and 4) ethics in human-AI interactions. We highlight the intuitions and key techniques used in each approach, and discuss promising future research directions towards successful integration of ethical AI systems into human societies.

* H. Yu, Z. Shen, C. Miao, C. Leung, V. R. Lesser & Q. Yang, "Building Ethics into Artificial Intelligence," in Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI'18), pp. 5527-5533, 2018 

  Access Paper or Ask Questions

SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations

May 12, 2017
Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick

We present a feature vector formation technique for documents - Sparse Composite Document Vector (SCDV) - which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text representation. In SCDV, word embedding's are clustered to capture multiple semantic contexts in which words occur. They are then chained together to form document topic-vectors that can express complex, multi-topic documents. Through extensive experiments on multi-class and multi-label classification tasks, we outperform the previous state-of-the-art method, NTSG (Liu et al., 2015a). We also show that SCDV embedding's perform well on heterogeneous tasks like Topic Coherence, context-sensitive Learning and Information Retrieval. Moreover, we achieve significant reduction in training and prediction times compared to other representation methods. SCDV achieves best of both worlds - better performance with lower time and space complexity.

* 10 pages, 5 figures. Update: Added results on Information Retrieval and Topic Coherence with Discussion 

  Access Paper or Ask Questions

Modeling Human Behavior Part II -- Cognitive approaches and Uncertainty

May 13, 2022
Andrew Fuchs, Andrea Passarella, Marco Conti

As we discussed in Part I of this topic, there is a clear desire to model and comprehend human behavior. Given the popular presupposition of human reasoning as the standard for learning and decision-making, there have been significant efforts and a growing trend in research to replicate these innate human abilities in artificial systems. In Part I, we discussed learning methods which generate a model of behavior from exploration of the system and feedback based on the exhibited behavior as well as topics relating to the use of or accounting for beliefs with respect to applicable skills or mental states of others. In this work, we will continue the discussion from the perspective of methods which focus on the assumed cognitive abilities, limitations, and biases demonstrated in human reasoning. We will arrange these topics as follows (i) methods such as cognitive architectures, cognitive heuristics, and related which demonstrate assumptions of limitations on cognitive resources and how that impacts decisions and (ii) methods which generate and utilize representations of bias or uncertainty to model human decision-making or the future outcomes of decisions.

* This is Part 2 of our review (see Modeling Human Behavior Part I - Learning and Belief Approaches) relating to learning and modeling behavior. This work was partially funded by the following projects. European Union's Horizon 2020 research and innovation programme: HumaneAI-Net (No 952026). CHIST-ERA program: SAI project (grant CHIST-ERA-19-XAI-010, funded by MUR, grant number not yet available) 

  Access Paper or Ask Questions

Library of Congress Subject Heading (LCSH) Browsing and Natural Language Searching

Sep 30, 2021
Charles-Antoine Julien, Banafsheh Asadi, Jesse David Dinneen, Fei Shu

Controlled topical vocabularies (CVs) are built into information systems to aid browsing and retrieval of items that may be unfamiliar, but it is unclear how this feature should be integrated with standard keyword searching. Few systems or scholarly prototypes have attempted this, and none have used the most widely used CV, the Library of Congress Subject Headings (LCSH), which organizes monograph collections in academic libraries throughout the world. This paper describes a working prototype of a Web application that concurrently allows topic exploration using an outline tree view of the LCSH hierarchy and natural language keyword searching of a real-world Science and Engineering bibliographic collection. Pilot testing shows the system is functional, and work to fit the complex LCSH structure into a usable hierarchy is ongoing. This study contributes to knowledge of the practical design decisions required when developing linked interactions between topical hierarchy browsing and natural language searching, which promise to facilitate information discovery and exploration.

* In ASIST 2016: Proceedings of the 79th Annual Meeting of the Association for Information Science & Technology, 53 
* conference paper (ASIST '16), 4 pages plus a poster 

  Access Paper or Ask Questions