"Topic Modeling": models, code, and papers

Exploring Context Modeling Techniques on the Spatiotemporal Crowd Flow Prediction

Jun 30, 2021
Liyue Chen, Leye Wang

In the big data and AI era, context is widely exploited as extra information that makes it easier to learn more complex patterns in machine learning systems. However, most existing related studies seldom take context into account. The difficulty lies in the unknown generalization ability of both context and its modeling techniques across different scenarios. To fill these gaps, we conduct a large-scale analytical and empirical study on the spatiotemporal crowd flow prediction (STCFP) problem, a widely studied and active research topic. We make three main contributions: (i) we develop a new taxonomy of both context features and context modeling techniques based on extensive investigation of prevailing STCFP research; (ii) we conduct extensive experiments on seven datasets with hundreds of millions of records to quantitatively evaluate the generalization ability of both distinct context features and context modeling techniques; (iii) we summarize guidelines for researchers to conveniently utilize context in diverse applications.


Dependent Multinomial Models Made Easy: Stick Breaking with the Pólya-Gamma Augmentation

Jun 18, 2015
Scott W. Linderman, Matthew J. Johnson, Ryan P. Adams

Many practical modeling problems involve discrete data that are best represented as draws from multinomial or categorical distributions. For example, nucleotides in a DNA sequence, children's names in a given state and year, and text documents are all commonly modeled with multinomial distributions. In all of these cases, we expect some form of dependency between the draws: the nucleotide at one position in the DNA strand may depend on the preceding nucleotides, children's names are highly correlated from year to year, and topics in text may be correlated and dynamic. These dependencies are not naturally captured by the typical Dirichlet-multinomial formulation. Here, we leverage a logistic stick-breaking representation and recent innovations in Pólya-gamma augmentation to reformulate the multinomial distribution in terms of latent variables with jointly Gaussian likelihoods, enabling us to take advantage of a host of Bayesian inference techniques for Gaussian models with minimal overhead.
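
The logistic stick-breaking representation mentioned in this abstract can be illustrated with a minimal sketch. It only shows the general idea of mapping K-1 unconstrained real values to a K-category probability vector; the function names `sigmoid` and `stick_breaking` are illustrative assumptions, and the Pólya-gamma augmentation step itself is not shown.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def stick_breaking(psi):
    """Map K-1 real values to a length-K probability vector.

    Each value decides what fraction of the remaining "stick" goes to
    the next category; the last category receives whatever is left.
    (Hypothetical helper for illustration, not the authors' code.)
    """
    probs = []
    remaining = 1.0  # length of stick not yet assigned
    for value in psi:
        share = sigmoid(value)
        probs.append(share * remaining)
        remaining *= 1.0 - share
    probs.append(remaining)
    return probs


# With all inputs at zero, each break takes half of what remains.
print(stick_breaking([0.0, 0.0]))  # [0.5, 0.25, 0.25]
```

Because the inputs are unconstrained reals, they can be given Gaussian priors (for example, with correlations over time), which is the kind of dependency the abstract says the Dirichlet-multinomial formulation does not capture naturally.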


Thousand to One: Semantic Prior Modeling for Conceptual Coding

Mar 16, 2021
Jianhui Chang, Zhenghui Zhao, Lingbo Yang, Chuanmin Jia, Jian Zhang, Siwei Ma

Conceptual coding, which encodes natural images into disentangled conceptual representations for compression, has recently emerged as a research topic. However, the compression performance of existing methods is still sub-optimal because rate constraints and reconstruction quality are not considered comprehensively. To this end, we propose a novel end-to-end semantic prior modeling-based conceptual coding scheme towards extremely low bitrate image compression, which leverages semantic-wise deep representations as a unified prior for entropy estimation and texture synthesis. Specifically, we employ semantic segmentation maps as structural guidance for extracting deep semantic prior, which provides fine-grained texture distribution modeling for better detail construction and higher flexibility in subsequent high-level vision tasks. Moreover, a cross-channel entropy model is proposed to further exploit the inter-channel correlation of the spatially independent semantic prior, leading to more accurate entropy estimation for rate-constrained training. The proposed scheme achieves an ultra-high 1000x compression ratio, while still enjoying high visual reconstruction quality and versatility towards visual processing and analysis tasks.

* ICME 2021 ORAL accepted 

Explaining predictive models with mixed features using Shapley values and conditional inference trees

Jul 02, 2020
Annabelle Redelmeier, Martin Jullum, Kjersti Aas

It is becoming increasingly important to explain complex, black-box machine learning models. Although there is an expanding literature on this topic, Shapley values stand out as a sound method to explain predictions from any type of machine learning model. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. This methodology was then extended to explain dependent features with an underlying continuous distribution. In this paper, we propose a method to explain mixed (i.e. continuous, discrete, ordinal, and categorical) dependent features by modeling the dependence structure of the features using conditional inference trees. We demonstrate our proposed method against the current industry standards in various simulation studies and find that our method often outperforms the other approaches. Finally, we apply our method to a real financial data set used in the 2018 FICO Explainable Machine Learning Challenge and show how our explanations compare to the FICO challenge Recognition Award winning team.
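
To make the notion of a Shapley value concrete, here is a minimal sketch that computes exact Shapley values by enumerating feature subsets. It uses the independence-style simplification the abstract attributes to the original development (features outside a coalition are replaced by a fixed baseline), not the conditional inference tree method the paper proposes. The toy linear model, its weights, the baseline, and all function names are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function.

    `value` takes a frozenset of feature indices and returns a number.
    Cost grows exponentially with n_features, so this suits toy cases only.
    """
    n = n_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy setup: a linear model, an instance, and a baseline.
WEIGHTS = [2.0, -1.0, 0.5]
INSTANCE = [1.0, 2.0, 3.0]
BASELINE = [0.0, 1.0, 1.0]  # stands in for the feature means


def model(x):
    return sum(w * v for w, v in zip(WEIGHTS, x))


def coalition_value(subset):
    """Evaluate the model with features outside `subset` set to the baseline."""
    x = [INSTANCE[j] if j in subset else BASELINE[j] for j in range(len(INSTANCE))]
    return model(x)


phi = shapley_values(coalition_value, 3)
print(phi)  # close to [2.0, -1.0, 1.0] for this linear model
```

For a linear model under this simplification, each value reduces to the weight times the feature's deviation from the baseline, and the values sum to the difference between the prediction for the instance and the prediction for the baseline.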


I Know Where You Are Coming From: On the Impact of Social Media Sources on AI Model Performance

Feb 05, 2020
Qi Yang, Aleksandr Farseev, Andrey Filchenkov

Nowadays, social networks play a crucial role in everyday human life and are no longer associated purely with leisure time. In fact, instant communication with friends and colleagues has become an essential component of our daily interaction, giving rise to multiple new types of social networks. By participating in such networks, individuals generate a multitude of data points that describe their activities from different perspectives and can be further used for applications such as personalized recommendation or user profiling. However, the impact of different social media networks on machine learning model performance has not yet been studied comprehensively. In particular, the literature on modeling multi-modal data from multiple social networks is relatively sparse, which inspired us to take a deeper dive into the topic in this preliminary study. Specifically, in this work, we study the performance of different machine learning models when trained on multi-modal data from different social networks. Our initial experimental results reveal that the choice of social network affects performance and that proper selection of the data source is crucial.

* AAAI-20 

Investor Reaction to Financial Disclosures Across Topics: An Application of Latent Dirichlet Allocation

May 08, 2018
Stefan Feuerriegel, Nicolas Pröllochs

This paper provides a holistic study of how stock prices vary in their response to financial disclosures across different topics. In doing so, we specifically shed light on the large volume of filings for which no a priori categorization of content exists. For this purpose, we utilize an approach from data mining, namely latent Dirichlet allocation, as a means of topic modeling. This technique facilitates our task of automatically categorizing, ex ante, the content of more than 70,000 regulatory 8-K filings from U.S. companies. We then evaluate the subsequent stock market reaction. Our empirical evidence suggests a considerable discrepancy among various types of news stories in terms of their relevance and impact on financial markets. For instance, we find a statistically significant abnormal return in response to earnings results and credit ratings, but also for disclosures regarding business strategy, the health sector, as well as mergers and acquisitions. Our results yield findings that benefit managers, investors and policy-makers by indicating how regulatory filings should be structured and the topics most likely to precede changes in stock valuations.


Innovative Bert-based Reranking Language Models for Speech Recognition

Apr 11, 2021
Shih-Hsuan Chiu, Berlin Chen

Recently, Bidirectional Encoder Representations from Transformers (BERT) was proposed and has achieved impressive success on many natural language processing (NLP) tasks such as question answering and language understanding, due mainly to its effective pre-training then fine-tuning paradigm as well as its strong local contextual modeling ability. In view of the above, this paper presents a novel instantiation of BERT-based contextualized language models (LMs) for use in reranking the N-best hypotheses produced by automatic speech recognition (ASR). To this end, we frame N-best hypothesis reranking with BERT as a prediction problem, which aims to predict the oracle hypothesis that has the lowest word error rate (WER) given the N-best hypotheses (denoted by PBERT). In particular, we also explore capitalizing on task-specific global topic information in an unsupervised manner to assist PBERT in N-best hypothesis reranking (denoted by TPBERT). Extensive experiments conducted on the AMI benchmark corpus demonstrate the effectiveness and feasibility of our methods in comparison to conventional autoregressive models like the recurrent neural network (RNN) and a recently proposed method that employed BERT to compute pseudo-log-likelihood (PLL) scores for N-best hypothesis reranking.

* 6 pages, 3 figures, Published in IEEE SLT 2021 
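
The abstract above frames reranking as predicting the oracle hypothesis, the entry in the N-best list with the lowest word error rate (WER). A minimal sketch of how that oracle target could be identified when a reference transcript is available is shown below. The function names and the example sentences are illustrative assumptions; this is not the PBERT model, only the WER-based labeling idea, and it assumes a non-empty reference.

```python
def word_errors(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)."""
    ref = reference.split()
    hyp = hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes the reference contains at least one word."""
    return word_errors(reference, hypothesis) / len(reference.split())


def oracle_index(reference, n_best):
    """Index of the hypothesis with the lowest WER (first one on ties)."""
    return min(range(len(n_best)), key=lambda k: wer(reference, n_best[k]))


# Made-up example data.
reference = "the cat sat on the mat"
n_best = [
    "the cat sat on a mat",
    "the cat sat on the mat",
    "a cat sat on mat",
]
print(oracle_index(reference, n_best))  # 1
```

At training time, an index like this could serve as the classification target for a reranker; at test time no reference exists, so the model has to predict it from the hypotheses alone.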

Approximating the Void: Learning Stochastic Channel Models from Observation with Variational Generative Adversarial Networks

Aug 20, 2018
Timothy J. O'Shea, Tamoghna Roy, Nathan West

Channel modeling is a critical topic when designing, learning, or evaluating the performance of any communications system. Most prior work in designing or learning new modulation schemes has focused on using highly simplified analytic channel models such as additive white Gaussian noise (AWGN), Rayleigh fading channels or similar. Recently, we proposed the use of generative adversarial networks (GANs) to jointly approximate a wireless channel response model (e.g. from real black box measurements) and optimize for an efficient modulation scheme over it using machine learning. This approach worked to some degree, but was unable to produce accurate probability distribution functions (PDFs) representing the stochastic channel response. In this paper, we focus specifically on the problem of accurately learning a channel PDF using a variational GAN, introducing an architecture and loss function which can accurately capture stochastic behavior. We illustrate where our prior method failed and share results capturing the performance of such a system over a range of realistic channel distributions.


Deep Importance Sampling based on Regression for Model Inversion and Emulation

Oct 20, 2020
F. Llorente, L. Martino, D. Delgado, G. Camps-Valls

Understanding systems by forward and inverse modeling is a recurrent topic of research in many domains of science and engineering. In this context, Monte Carlo methods have been widely used as powerful tools for numerical inference and optimization. They require the choice of a suitable proposal density, which is crucial for their performance. For this reason, several adaptive importance sampling (AIS) schemes have been proposed in the literature. We here present an AIS framework called Regression-based Adaptive Deep Importance Sampling (RADIS). In RADIS, the key idea is the adaptive construction via regression of a non-parametric proposal density (i.e., an emulator), which mimics the posterior distribution and hence minimizes the mismatch between proposal and target densities. RADIS is based on a deep architecture of two (or more) nested IS schemes, in order to draw samples from the constructed emulator. The algorithm is highly efficient since it employs the posterior approximation as the proposal density, which can be improved by adding more support points. As a consequence, RADIS asymptotically converges to an exact sampler under mild conditions. Additionally, the emulator produced by RADIS can in turn be used as a cheap surrogate model for further studies. We introduce two specific RADIS implementations that use Gaussian Processes (GPs) and Nearest Neighbors (NN) for constructing the emulator. Several numerical experiments and comparisons show the benefits of the proposed schemes. A real-world application in remote sensing model inversion and emulation confirms the validity of the approach.
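
The role of the proposal density described in this abstract can be illustrated with plain self-normalized importance sampling. The sketch below is not RADIS: it uses a fixed Gaussian proposal rather than an adaptively built emulator, and the target density, parameter values, and function names are all assumptions chosen for illustration. It reports the effective sample size (ESS), which tends to drop as the mismatch between proposal and target grows.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a toy target: Gaussian, mean 1, std 0.5."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2


def importance_sample(n, mu, sigma, rng):
    """Self-normalized importance sampling with a Gaussian proposal.

    Returns the estimated target mean and the effective sample size.
    """
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    # Log-weights: target minus proposal (shared constants cancel on normalizing).
    log_w = [
        log_target(x) + 0.5 * ((x - mu) / sigma) ** 2 + math.log(sigma)
        for x in xs
    ]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w = [v / total for v in w]
    mean_estimate = sum(wi * xi for wi, xi in zip(w, xs))
    ess = 1.0 / sum(wi * wi for wi in w)
    return mean_estimate, ess


rng = random.Random(0)
poor_mean, poor_ess = importance_sample(20000, 0.0, 2.0, rng)   # broad, off-center
good_mean, good_ess = importance_sample(20000, 1.0, 0.6, rng)   # close to target
print(poor_mean, poor_ess)
print(good_mean, good_ess)
```

Both runs should estimate a mean near 1, but the proposal that resembles the target keeps far more of its samples effective, which is the motivation the abstract gives for building the proposal from a posterior approximation.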
