Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Tensorized Embedding Layers for Efficient Model Compression

Jan 30, 2019
Valentin Khrulkov, Oleksii Hrinchuk, Leyla Mirvakhabova, Ivan Oseledets

The embedding layers transforming input words into real vectors are the key components of deep neural networks used in natural language processing. However, when the vocabulary is large (e.g., 800k unique words in the One-Billion-Word dataset), the corresponding weight matrices can be enormous, which precludes their deployment in a limited resource setting. We introduce a novel way of parametrizing embedding layers based on the Tensor Train (TT) decomposition, which allows compressing the model significantly at the cost of a negligible drop or even a slight gain in performance. Importantly, our method does not take the pre-trained model and compress its weights but rather supplants the standard embedding layers with their TT-based counterparts. The resulting model is then trained end-to-end, however, it can capitalize on larger batches due to the reduced memory requirements. We evaluate our method on a wide range of benchmarks in sentiment analysis, neural machine translation, and language modeling, and analyze the trade-off between performance and compression ratios for a wide range of architectures, from MLPs to LSTMs and Transformers.

  Access Paper or Ask Questions

Extracting Predictive Information from Heterogeneous Data Streams using Gaussian Processes

Jul 11, 2018
Sid Ghoshal, Stephen Roberts

Financial markets are notoriously complex environments, presenting vast amounts of noisy, yet potentially informative data. We consider the problem of forecasting financial time series from a wide range of information sources using online Gaussian Processes with Automatic Relevance Determination (ARD) kernels. We measure the performance gain, quantified in terms of Normalised Root Mean Square Error (NRMSE), Median Absolute Deviation (MAD) and Pearson correlation, from fusing each of four separate data domains: time series technicals, sentiment analysis, options market data and broker recommendations. We show evidence that ARD kernels produce meaningful feature rankings that help retain salient inputs and reduce input dimensionality, providing a framework for sifting through financial complexity. We measure the performance gain from fusing each domain's heterogeneous data streams into a single probabilistic model. In particular our findings highlight the critical value of options data in mapping out the curvature of price space and inspire an intuitive, novel direction for research in financial prediction.

* 15 pages, 5 figures, accepted for publication in Algorithmic Finance 

  Access Paper or Ask Questions

Stylistic Variation in Social Media Part-of-Speech Tagging

Apr 19, 2018
Murali Raghu Babu Balusu, Taha Merghani, Jacob Eisenstein

Social media features substantial stylistic variation, raising new challenges for syntactic analysis of online writing. However, this variation is often aligned with author attributes such as age, gender, and geography, as well as more readily-available social network metadata. In this paper, we report new evidence on the link between language and social networks in the task of part-of-speech tagging. We find that tagger error rates are correlated with network structure, with high accuracy in some parts of the network, and lower accuracy elsewhere. As a result, tagger accuracy depends on training from a balanced sample of the network, rather than training on texts from a narrow subcommunity. We also describe our attempts to add robustness to stylistic variation, by building a mixture-of-experts model in which each expert is associated with a region of the social network. While prior work found that similar approaches yield performance improvements in sentiment analysis and entity linking, we were unable to obtain performance improvements in part-of-speech tagging, despite strong evidence for the link between part-of-speech error rates and social network structure.

* 9 pages, Published in Proceedings of NAACL workshop on stylistic variation (2018) 

  Access Paper or Ask Questions

Generative Concatenative Nets Jointly Learn to Write and Classify Reviews

Apr 07, 2016
Zachary C. Lipton, Sharad Vikram, Julian McAuley

A recommender system's basic task is to estimate how users will respond to unseen items. This is typically modeled in terms of how a user might rate a product, but here we aim to extend such approaches to model how a user would write about the product. To do so, we design a character-level Recurrent Neural Network (RNN) that generates personalized product reviews. The network convincingly learns styles and opinions of nearly 1000 distinct authors, using a large corpus of reviews from It also tailors reviews to describe specific items, categories, and star ratings. Using a simple input replication strategy, the Generative Concatenative Network (GCN) preserves the signal of static auxiliary inputs across wide sequence intervals. Without any additional training, the generative model can classify reviews, identifying the author of the review, the product category, and the sentiment (rating), with remarkable accuracy. Our evaluation shows the GCN captures complex dynamics in text, such as the effect of negation, misspellings, slang, and large vocabularies gracefully absent any machinery explicitly dedicated to the purpose.

  Access Paper or Ask Questions

Unsupervised Cross-Domain Word Representation Learning

May 27, 2015
Danushka Bollegala, Takanori Maehara, Ken-ichi Kawarabayashi

Meaning of a word varies from one domain to another. Despite this important domain dependence in word semantics, existing word representation learning methods are bound to a single domain. Given a pair of \emph{source}-\emph{target} domains, we propose an unsupervised method for learning domain-specific word representations that accurately capture the domain-specific aspects of word semantics. First, we select a subset of frequent words that occur in both domains as \emph{pivots}. Next, we optimize an objective function that enforces two constraints: (a) for both source and target domain documents, pivots that appear in a document must accurately predict the co-occurring non-pivots, and (b) word representations learnt for pivots must be similar in the two domains. Moreover, we propose a method to perform domain adaptation using the learnt word representations. Our proposed method significantly outperforms competitive baselines including the state-of-the-art domain-insensitive word representations, and reports best sentiment classification accuracies for all domain-pairs in a benchmark dataset.

* 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conferences on Natural Language Processing of the Asian Federation of Natural Language Processing 

  Access Paper or Ask Questions

Neutral Utterances are Also Causes: Enhancing Conversational Causal Emotion Entailment with Social Commonsense Knowledge

May 02, 2022
Jiangnan Li, Fandong Meng, Zheng Lin, Rui Liu, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou

Conversational Causal Emotion Entailment aims to detect causal utterances for a non-neutral targeted utterance from a conversation. In this work, we build conversations as graphs to overcome implicit contextual modelling of the original entailment style. Following the previous work, we further introduce the emotion information into graphs. Emotion information can markedly promote the detection of causal utterances whose emotion is the same as the targeted utterance. However, it is still hard to detect causal utterances with different emotions, especially neutral ones. The reason is that models are limited in reasoning causal clues and passing them between utterances. To alleviate this problem, we introduce social commonsense knowledge (CSK) and propose a Knowledge Enhanced Conversation graph (KEC). KEC propagates the CSK between two utterances. As not all CSK is emotionally suitable for utterances, we therefore propose a sentiment-realized knowledge selecting strategy to filter CSK. To process KEC, we further construct the Knowledge Enhanced Directed Acyclic Graph networks. Experimental results show that our method outperforms baselines and infers more causes with different emotions from the targeted utterance.

  Access Paper or Ask Questions

Twitter-Demographer: A Flow-based Tool to Enrich Twitter Data

Jan 26, 2022
Federico Bianchi, Vincenzo Cutrona, Dirk Hovy

Twitter data have become essential to Natural Language Processing (NLP) and social science research, driving various scientific discoveries in recent years. However, the textual data alone are often not enough to conduct studies: especially social scientists need more variables to perform their analysis and control for various factors. How we augment this information, such as users' location, age, or tweet sentiment, has ramifications for anonymity and reproducibility, and requires dedicated effort. This paper describes Twitter-Demographer, a simple, flow-based tool to enrich Twitter data with additional information about tweets and users. Twitter-Demographer is aimed at NLP practitioners and (computational) social scientists who want to enrich their datasets with aggregated information, facilitating reproducibility, and providing algorithmic privacy-by-design measures for pseudo-anonymity. We discuss our design choices, inspired by the flow-based programming paradigm, to use black-box components that can easily be chained together and extended. We also analyze the ethical issues related to the use of this tool, and the built-in measures to facilitate pseudo-anonymity.

  Access Paper or Ask Questions

Topical Language Generation using Transformers

Mar 11, 2021
Rohola Zandie, Mohammad H. Mahoor

Large-scale transformer-based language models (LMs) demonstrate impressive capabilities in open text generation. However, controlling the generated text's properties such as the topic, style, and sentiment is challenging and often requires significant changes to the model architecture or retraining and fine-tuning the model on new supervised data. This paper presents a novel approach for Topical Language Generation (TLG) by combining a pre-trained LM with topic modeling information. We cast the problem using Bayesian probability formulation with topic probabilities as a prior, LM probabilities as the likelihood, and topical language generation probability as the posterior. In learning the model, we derive the topic probability distribution from the user-provided document's natural structure. Furthermore, we extend our model by introducing new parameters and functions to influence the quantity of the topical features presented in the generated text. This feature would allow us to easily control the topical properties of the generated text. Our experimental results demonstrate that our model outperforms the state-of-the-art results on coherency, diversity, and fluency while being faster in decoding.

* Accepted in the Journal of Natural Language Engineering 

  Access Paper or Ask Questions

An Enhanced Text Classification to Explore Health based Indian Government Policy Tweets

Aug 18, 2020
Aarzoo Dhiman, Durga Toshniwal

Government-sponsored policy-making and scheme generations is one of the means of protecting and promoting the social, economic, and personal development of the citizens. The evaluation of effectiveness of these schemes done by government only provide the statistical information in terms of facts and figures which do not include the in-depth knowledge of public perceptions, experiences and views on the topic. In this research work, we propose an improved text classification framework that classifies the Twitter data of different health-based government schemes. The proposed framework leverages the language representation models (LR models) BERT, ELMO, and USE. However, these LR models have less real-time applicability due to the scarcity of the ample annotated data. To handle this, we propose a novel GloVe word embeddings and class-specific sentiments based text augmentation approach (named Mod-EDA) which boosts the performance of text classification task by increasing the size of labeled data. Furthermore, the trained model is leveraged to identify the level of engagement of citizens towards these policies in different communities such as middle-income and low-income groups.

* Accepted to KDD 2020: Applied Data Science for Healthcare Workshop (4 pages, 2 figures, 2 tables) 

  Access Paper or Ask Questions