Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Mapping Research Trajectories

Apr 25, 2022
Bastian Schäfermeier, Gerd Stumme, Tom Hanika

Steadily growing amounts of information, such as annually published scientific papers, have become so large that they elude an extensive manual analysis. Hence, to maintain an overview, automated methods for the mapping and visualization of knowledge domains are necessary and important, e.g., for scientific decision makers. Of particular interest in this field is the development of research topics of different entities (e.g., scientific authors and venues) over time. However, existing approaches for their analysis are only suitable for single entity types, such as venues, and they often do not capture the research topics or the time dimension in an easily interpretable manner. Hence, we propose a principled approach for \emph{mapping research trajectories}, which is applicable to all kinds of scientific entities that can be represented by sets of published papers. For this, we transfer ideas and principles from the geographic visualization domain, specifically trajectory maps and interactive geographic maps. Our visualizations depict the research topics of entities over time in a straightforward interpr. manner. They can be navigated by the user intuitively and restricted to specific elements of interest. The maps are derived from a corpus of research publications (i.e., titles and abstracts) through a combination of unsupervised machine learning methods. In a practical demonstrator application, we exemplify the proposed approach on a publication corpus from machine learning. We observe that our trajectory visualizations of 30 top machine learning venues and 1000 major authors in this field are well interpretable and are consistent with background knowledge drawn from the entities' publications. Next to producing interactive, interpr. visualizations supporting different kinds of analyses, our computed trajectories are suitable for trajectory mining applications in the future.

  Access Paper or Ask Questions

Deep Learning for Hyperspectral Image Classification: An Overview

Oct 26, 2019
Shutao Li, Weiwei Song, Leyuan Fang, Yushi Chen, Pedram Ghamisi, Jón Atli Benediktsson

Hyperspectral image (HSI) classification has become a hot topic in the field of remote sensing. In general, the complex characteristics of hyperspectral data make the accurate classification of such data challenging for traditional machine learning methods. In addition, hyperspectral imaging often deals with an inherently nonlinear relation between the captured spectral information and the corresponding materials. In recent years, deep learning has been recognized as a powerful feature-extraction tool to effectively address nonlinear problems and widely used in a number of image processing tasks. Motivated by those successful applications, deep learning has also been introduced to classify HSIs and demonstrated good performance. This survey paper presents a systematic review of deep learning-based HSI classification literatures and compares several strategies for this topic. Specifically, we first summarize the main challenges of HSI classification which cannot be effectively overcome by traditional machine learning methods, and also introduce the advantages of deep learning to handle these problems. Then, we build a framework which divides the corresponding works into spectral-feature networks, spatial-feature networks, and spectral-spatial-feature networks to systematically review the recent achievements in deep learning-based HSI classification. In addition, considering the fact that available training samples in the remote sensing field are usually very limited and training deep networks require a large number of samples, we include some strategies to improve classification performance, which can provide some guidelines for future studies on this topic. Finally, several representative deep learning-based classification methods are conducted on real HSIs in our experiments.

* IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 9, pp. 6690-6709, Sep. 2019 

  Access Paper or Ask Questions

Framing COVID-19: How we conceptualize and discuss the pandemic on Twitter

Apr 15, 2020
Philipp Wicke, Marianna M. Bolognesi

Doctors and nurses in these weeks are busy in the trenches, fighting against a new invisible enemy: Covid-19. Cities are locked down and civilians are besieged in their own homes, to prevent the spreading of the virus. War-related terminology is commonly used to frame the discourse around epidemics and diseases. Arguably the discourse around the current epidemic will make use of war-related metaphors too,not only in public discourse and the media, but also in the tweets written by non-experts of mass communication. We hereby present an analysis of the discourse around #Covid-19, based on a corpus of 200k tweets posted on Twitter during March and April 2020. Using topic modelling we first analyze the topics around which the discourse can be classified. Then, we show that the WAR framing is used to talk about specific topics, such as the virus treatment, but not others, such as the effects of social distancing on the population. We then measure and compare the popularity of the WAR frame to three alternative figurative frames (MONSTER, STORM and TSUNAMI) and a literal frame used as control (FAMILY). The results show that while the FAMILY literal frame covers a wider portion of the corpus, among the figurative framings WAR is the most frequently used, and thus arguably the most conventional one. However, we conclude, this frame is not apt to elaborate the discourse around many aspects involved in the current situation. Therefore, we conclude, in line with previous suggestions, a plethora of framing options, or a metaphor menu, may facilitate the communication of various aspects involved in the Covid-19-related discourse on the social media, and thus support civilians in the expression of their feelings, opinions and ideas during the current pandemic.

* 41 pages, 6 figures 

  Access Paper or Ask Questions

365 Dots in 2019: Quantifying Attention of News Sources

Mar 22, 2020
Alexander C. Nwala, Michele C. Weigle, Michael L. Nelson

We investigate the overlap of topics of online news articles from a variety of sources. To do this, we provide a platform for studying the news by measuring this overlap and scoring news stories according to the degree of attention in near-real time. This can enable multiple studies, including identifying topics that receive the most attention from news organizations and identifying slow news days versus major news days. Our application, StoryGraph, periodically (10-minute intervals) extracts the first five news articles from the RSS feeds of 17 US news media organizations across the partisanship spectrum (left, center, and right). From these articles, StoryGraph extracts named entities (PEOPLE, LOCATIONS, ORGANIZATIONS, etc.) and then represents each news article with its set of extracted named entities. Finally, StoryGraph generates a news similarity graph where the nodes represent news articles, and an edge between a pair of nodes represents a high degree of similarity between the nodes (similar news stories). Each news story within the news similarity graph is assigned an attention score which quantifies the amount of attention the topics in the news story receive collectively from the news media organizations. The StoryGraph service has been running since August 2017, and using this method, we determined that the top news story of 2018 was the "Kavanaugh hearings" with attention score of 25.85 on September 27, 2018. Similarly, the top news story for 2019 so far (2019-12-12) is "AG William Barr's release of his principal conclusions of the Mueller Report," with an attention score of 22.93 on March 24, 2019.

* This is an extended version of the paper accepted at Computation + Journalism Symposium 2020, which has been postponed because of COVID-19 

  Access Paper or Ask Questions

Conditional Hierarchical Bayesian Tucker Decomposition

Dec 12, 2019
Adam Sandler, Diego Klabjan, Yuan Luo

Our research focuses on studying and developing methods for reducing the dimensionality of large datasets, common in biomedical applications. A major problem when learning information about patients based on genetic sequencing data is that there are often more feature variables (genetic data) than observations (patients). This makes direct supervised learning difficult. One way of reducing the feature space is to use latent Dirichlet allocation in order to group genetic variants in an unsupervised manner. Latent Dirichlet allocation is a common model in natural language processing, which describes a document as a mixture of topics, each with a probability of generating certain words. This can be generalized as a Bayesian tensor decomposition to account for multiple feature variables. While we made some progress improving and modifying these methods, our significant contributions are with hierarchical topic modeling. We developed distinct methods of incorporating hierarchical topic modeling, based on nested Chinese restaurant processes and Pachinko Allocation Machine, into Bayesian tensor decompositions. We apply these models to predict whether or not patients have autism spectrum disorder based on genetic sequencing data. We examine a dataset from National Database for Autism Research consisting of paired siblings -- one with autism, and the other without -- and counts of their genetic variants. Additionally, we linked the genes with their Reactome biological pathways. We combine this information into a tensor of patients, counts of their genetic variants, and the membership of these genes in pathways. Once we decompose this tensor, we use logistic regression on the reduced features in order to predict if patients have autism. We also perform a similar analysis of a dataset of patients with one of four common types of cancer (breast, lung, prostate, and colorectal).

* 20 pages, added model evaluation and log-likelihood sections 

  Access Paper or Ask Questions

Conversational AI: The Science Behind the Alexa Prize

Jan 11, 2018
Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, Art Pettigrue

Conversational agents are exploding in popularity. However, much work remains in the area of social conversation as well as free-form conversation over a broad range of domains and topics. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million-dollar university competition where sixteen selected university teams were challenged to build conversational agents, known as socialbots, to converse coherently and engagingly with humans on popular topics such as Sports, Politics, Entertainment, Fashion and Technology for 20 minutes. The Alexa Prize offers the academic community a unique opportunity to perform research with a live system used by millions of users. The competition provided university teams with real user conversational data at scale, along with the user-provided ratings and feedback augmented with annotations by the Alexa team. This enabled teams to effectively iterate and make improvements throughout the competition while being evaluated in real-time through live user interactions. To build their socialbots, university teams combined state-of-the-art techniques with novel strategies in the areas of Natural Language Understanding, Context Modeling, Dialog Management, Response Generation, and Knowledge Acquisition. To support the efforts of participating teams, the Alexa Prize team made significant scientific and engineering investments to build and improve Conversational Speech Recognition, Topic Tracking, Dialog Evaluation, Voice User Experience, and tools for traffic management and scalability. This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational AI.

* Alexa.Prize.Proceedings accessed (2018)-01-01 
* 18 pages, 5 figures, Alexa Prize Proceedings Paper (, Alexa Prize University Competition to advance Conversational AI 

  Access Paper or Ask Questions

MP Twitter Engagement and Abuse Post-first COVID-19 Lockdown in the UK: White Paper

Mar 08, 2021
Tracie Farrell, Mehmet Bakir, Kalina Bontcheva

The UK has had a volatile political environment for some years now, with Brexit and leadership crises marking the past five years. With this work, we wanted to understand more about how the global health emergency, COVID-19, influences the amount, type or topics of abuse that UK politicians receive when engaging with the public. With this work, we wanted to understand more about how the global health emergency, COVID-19, influences the amount, type or topics of abuse that UK politicians receive when engaging with the public. This work covers the period of June to December 2020 and analyses Twitter abuse in replies to UK MPs. This work is a follow-up from our analysis of online abuse during the first four months of the COVID-19 pandemic in the UK. The paper examines overall abuse levels during this new seven month period, analyses reactions to members of different political parties and the UK government, and the relationship between online abuse and topics such as Brexit, government's COVID-19 response and policies, and social issues. In addition, we have also examined the presence of conspiracy theories posted in abusive replies to MPs during the period. We have found that abuse levels toward UK MPs were at an all-time high in December 2020 (5.4% of all reply tweets sent to MPs). This is almost 1% higher that the two months preceding the General Election. In a departure from the trend seen in the first four months of the pandemic, MPs from the Tory party received the highest percentage of abusive replies from July 2020 onward, which stays above 5% starting from September 2020 onward, as the COVID-19 crisis deepened and the Brexit negotiations with the EU started nearing completion.

  Access Paper or Ask Questions

Political Depolarization of News Articles Using Attribute-aware Word Embeddings

Jan 05, 2021
Ruibo Liu, Lili Wang, Chenyan Jia, Soroush Vosoughi

Political polarization in the US is on the rise. This polarization negatively affects the public sphere by contributing to the creation of ideological echo chambers. In this paper, we focus on addressing one of the factors that contributes to this polarity, polarized media. We introduce a framework for depolarizing news articles. Given an article on a certain topic with a particular ideological slant (eg., liberal or conservative), the framework first detects polar language in the article and then generates a new article with the polar language replaced with neutral expressions. To detect polar words, we train a multi-attribute-aware word embedding model that is aware of ideology and topics on 360k full-length media articles. Then, for text generation, we propose a new algorithm called Text Annealing Depolarization Algorithm (TADA). TADA retrieves neutral expressions from the word embedding model that not only decrease ideological polarity but also preserve the original argument of the text, while maintaining grammatical correctness. We evaluate our framework by comparing the depolarized output of our model in two modes, fully-automatic and semi-automatic, on 99 stories spanning 11 topics. Based on feedback from 161 human testers, our framework successfully depolarized 90.1% of paragraphs in semi-automatic mode and 78.3% of paragraphs in fully-automatic mode. Furthermore, 81.2% of the testers agree that the non-polar content information is well-preserved and 79% agree that depolarization does not harm semantic correctness when they compare the original text and the depolarized text. Our work shows that data-driven methods can help to locate political polarity and aid in the depolarization of articles.

* In Proceedings of the 15th International AAAI Conference on Weblogs and Social Media (ICWSM 2021) 

  Access Paper or Ask Questions