Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Topic Modeling": models, code, and papers

Improved Patient Classification with Language Model Pretraining Over Clinical Notes

Oct 02, 2019
Jonas Kemp, Alvin Rajkomar, Andrew M. Dai

Clinical notes in electronic health records contain highly heterogeneous writing styles, including non-standard terminology or abbreviations. Using these notes in predictive modeling has traditionally required preprocessing (e.g. taking frequent terms or topic modeling) that removes much of the richness of the source data. We propose a pretrained hierarchical recurrent neural network model that parses minimally processed clinical notes in an intuitive fashion, and show that it improves performance for multiple classification tasks on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, improving top-5 recall to 89.7% (increase of 4.8%) for primary diagnosis classification and AUPRC to 35.2% (increase of 2.1%) for multilabel diagnosis classification compared to models that treat the notes as an unordered collection of terms, using no pretraining. We also apply an attribution technique to several examples to identify the words and the nearby context that the model uses to make its prediction, and show the importance of the words' context.

* Accepted at NeurIPS ML4H 2019, extended abstract track 
Access Paper or Ask Questions

Non-Pharmaceutical Intervention Discovery with Topic Modeling

Sep 10, 2020
Jonathan Smith, Borna Ghotbi, Seungeun Yi, Mahboobeh Parsapoor

We consider the task of discovering categories of non-pharmaceutical interventions during the evolving COVID-19 pandemic. We explore topic modeling on two corpora with national and international scope. These models discover existing categories when compared with human intervention labels while reduced human effort needed.

* ML for Global Health (ICML 2020 Workshop) 
Access Paper or Ask Questions

Neural Models for Documents with Metadata

Oct 23, 2018
Dallas Card, Chenhao Tan, Noah A. Smith

Most real-world document collections involve various types of metadata, such as author, source, and date, and yet the most commonly-used approaches to modeling text corpora ignore this information. While specialized models have been developed for particular applications, few are widely used in practice, as customization typically requires derivation of a custom inference algorithm. In this paper, we build on recent advances in variational inference methods and propose a general neural framework, based on topic models, to enable flexible incorporation of metadata and allow for rapid exploration of alternative models. Our approach achieves strong performance, with a manageable tradeoff between perplexity, coherence, and sparsity. Finally, we demonstrate the potential of our framework through an exploration of a corpus of articles about US immigration.

* Dallas Card, Chenhao Tan, and Noah A. Smith. (2018). Neural Models for Documents with Metadata. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 
* 13 pages, 3 figures, 6 tables; updating to version published at ACL 2018 
Access Paper or Ask Questions

Inductive Granger Causal Modeling for Multivariate Time Series

Feb 10, 2021
Yunfei Chu, Xiaowei Wang, Jianxin Ma, Kunyang Jia, Jingren Zhou, Hongxia Yang

Granger causal modeling is an emerging topic that can uncover Granger causal relationship behind multivariate time series data. In many real-world systems, it is common to encounter a large amount of multivariate time series data collected from different individuals with sharing commonalities. However, there are ongoing concerns regarding Granger causality's applicability in such large scale complex scenarios, presenting both challenges and opportunities for Granger causal structure reconstruction. Existing methods usually train a distinct model for each individual, suffering from inefficiency and over-fitting issues. To bridge this gap, we propose an Inductive GRanger cAusal modeling (InGRA) framework for inductive Granger causality learning and common causal structure detection on multivariate time series, which exploits the shared commonalities underlying the different individuals. In particular, we train one global model for individuals with different Granger causal structures through a novel attention mechanism, called prototypical Granger causal attention. The model can detect common causal structures for different individuals and infer Granger causal structures for newly arrived individuals. Extensive experiments, as well as an online A/B test on an E-commercial advertising platform, demonstrate the superior performances of InGRA.

* 6 pages, 6 figures 
Access Paper or Ask Questions

Neuron Linear Transformation: Modeling the Domain Shift for Crowd Counting

Apr 05, 2020
Qi Wang, Tao Han, Junyu Gao, Yuan Yuan

Cross-domain crowd counting (CDCC) is a hot topic due to its importance in public safety. The purpose of CDCC is to reduce the domain shift between the source and target domain. Recently, typical methods attempt to extract domain-invariant features via image translation and adversarial learning. When it comes to specific tasks, we find that the final manifestation of the task gap is in the parameters of the model, and the domain shift can be represented apparently by the differences in model weights. To describe the domain gap directly at the parameter-level, we propose a Neuron Linear Transformation (NLT) method, where NLT is exploited to learn the shift at neuron-level and then transfer the source model to the target model. Specifically, for a specific neuron of a source model, NLT exploits few labeled target data to learn a group of parameters, which updates the target neuron via a linear transformation. Extensive experiments and analysis on six real-world datasets validate that NLT achieves top performance compared with other domain adaptation methods. An ablation study also shows that the NLT is robust and more effective compare with supervised and fine-tune training. Furthermore, we will release the code after the paper is accepted.

* 12 paegs, 8 figures 
Access Paper or Ask Questions

Preference Enhanced Social Influence Modeling for Network-Aware Cascade Prediction

Apr 18, 2022
Likang Wu, Hao Wang, Enhong Chen, Zhi Li, Hongke Zhao, Jianhui Ma

Network-aware cascade size prediction aims to predict the final reposted number of user-generated information via modeling the propagation process in social networks. Estimating the user's reposting probability by social influence, namely state activation plays an important role in the information diffusion process. Therefore, Graph Neural Networks (GNN), which can simulate the information interaction between nodes, has been proved as an effective scheme to handle this prediction task. However, existing studies including GNN-based models usually neglect a vital factor of user's preference which influences the state activation deeply. To that end, we propose a novel framework to promote cascade size prediction by enhancing the user preference modeling according to three stages, i.e., preference topics generation, preference shift modeling, and social influence activation. Our end-to-end method makes the user activating process of information diffusion more adaptive and accurate. Extensive experiments on two large-scale real-world datasets have clearly demonstrated the effectiveness of our proposed model compared to state-of-the-art baselines.

* SIGIR 2022 
Access Paper or Ask Questions

Integration of Physics-Based and Data-Driven Models for Hyperspectral Image Unmixing

Jun 11, 2022
Jie Chen, Min Zhao, Xiuheng Wang, Cédric Richard, Susanto Rahardja

Spectral unmixing is one of the most important quantitative analysis tasks in hyperspectral data processing. Conventional physics-based models are characterized by clear interpretation. However, due to the complex mixture mechanism and limited nonlinearity modeling capacity, these models may not be accurate, especially, in analyzing scenes with unknown physical characteristics. Data-driven methods have developed rapidly in recent years, in particular deep learning methods as they possess superior capability in modeling complex and nonlinear systems. Simply transferring these methods as black-boxes to conduct unmixing may lead to low physical interpretability and generalization ability. Consequently, several contributions have been dedicated to integrating advantages of both physics-based models and data-driven methods. In this article, we present an overview of recent advances on this topic from several aspects, including deep neural network (DNN) structures design, prior capturing and loss design, and summarise these methods in a common mathematical optimization framework. In addition, relevant remarks and discussions are conducted made for providing further understanding and prospective improvement of the methods. The related source codes and data are collected and made available at

* IEEE Signal Process. Mag. Manuscript submitted March 14, 2022 
Access Paper or Ask Questions

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation

Aug 01, 2019
Jing Wang, Yingwei Pan, Ting Yao, Jinhui Tang, Tao Mei

Image paragraph generation is the task of producing a coherent story (usually a paragraph) that describes the visual content of an image. The problem nevertheless is not trivial especially when there are multiple descriptive and diverse gists to be considered for paragraph generation, which often happens in real images. A valid question is how to encapsulate such gists/topics that are worthy of mention from an image, and then describe the image from one topic to another but holistically with a coherent structure. In this paper, we present a new design --- Convolutional Auto-Encoding (CAE) that purely employs convolutional and deconvolutional auto-encoding framework for topic modeling on the region-level features of an image. Furthermore, we propose an architecture, namely CAE plus Long Short-Term Memory (dubbed as CAE-LSTM), that novelly integrates the learnt topics in support of paragraph generation. Technically, CAE-LSTM capitalizes on a two-level LSTM-based paragraph generation framework with attention mechanism. The paragraph-level LSTM captures the inter-sentence dependency in a paragraph, while sentence-level LSTM is to generate one sentence which is conditioned on each learnt topic. Extensive experiments are conducted on Stanford image paragraph dataset, and superior results are reported when comparing to state-of-the-art approaches. More remarkably, CAE-LSTM increases CIDEr performance from 20.93% to 25.15%.

* IJCAI 2019 
Access Paper or Ask Questions

Model Fusion with Kullback--Leibler Divergence

Jul 13, 2020
Sebastian Claici, Mikhail Yurochkin, Soumya Ghosh, Justin Solomon

We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors and proceeds using a simple assign-and-average approach. The components of the dataset posteriors are assigned to the proposed global model components by solving a regularized variant of the assignment problem. The global components are then updated based on these assignments by their mean under a KL divergence. For exponential family variational distributions, our formulation leads to an efficient non-parametric algorithm for computing the fused model. Our algorithm is easy to describe and implement, efficient, and competitive with state-of-the-art on motion capture analysis, topic modeling, and federated learning of Bayesian neural networks.

* ICML 2020 
Access Paper or Ask Questions

A Nested HDP for Hierarchical Topic Models

Jan 16, 2013
John Paisley, Chong Wang, David Blei, Michael I. Jordan

We develop a nested hierarchical Dirichlet process (nHDP) for hierarchical topic modeling. The nHDP is a generalization of the nested Chinese restaurant process (nCRP) that allows each word to follow its own path to a topic node according to a document-specific distribution on a shared tree. This alleviates the rigid, single-path formulation of the nCRP, allowing a document to more easily express thematic borrowings as a random effect. We demonstrate our algorithm on 1.8 million documents from The New York Times.

* Submitted to the workshop track of the International Conference on Learning Representations 2013. It is a short version of a longer paper 
Access Paper or Ask Questions