Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

On-the-Job Learning with Bayesian Decision Theory

Dec 07, 2015
Keenon Werling, Arun Chaganty, Percy Liang, Chris Manning

Our goal is to deploy a high-accuracy system starting with zero training examples. We consider an "on-the-job" setting, where as inputs arrive, we use real-time crowdsourcing to resolve uncertainty where needed and output our prediction when confident. As the model improves over time, the reliance on crowdsourcing queries decreases. We cast our setting as a stochastic game based on Bayesian decision theory, which allows us to balance latency, cost, and accuracy objectives in a principled way. Computing the optimal policy is intractable, so we develop an approximation based on Monte Carlo Tree Search. We tested our approach on three datasets---named-entity recognition, sentiment classification, and image classification. On the NER task we obtained more than an order of magnitude reduction in cost compared to full human annotation, while boosting performance relative to the expert provided labels. We also achieve a 8% F1 improvement over having a single human label the whole set, and a 28% F1 improvement over online learning.

* As appearing in NIPS 2015 

  Access Paper or Ask Questions

Distributed Representations of Sentences and Documents

May 22, 2014
Quoc V. Le, Tomas Mikolov

Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features have two major weaknesses: they lose the ordering of the words and they also ignore semantics of the words. For example, "powerful," "strong" and "Paris" are equally distant. In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents. Our algorithm represents each document by a dense vector which is trained to predict words in the document. Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models. Empirical results show that Paragraph Vectors outperform bag-of-words models as well as other techniques for text representations. Finally, we achieve new state-of-the-art results on several text classification and sentiment analysis tasks.

  Access Paper or Ask Questions

Disentangled Learning of Stance and Aspect Topics for Vaccine Attitude Detection in Social Media

May 06, 2022
Lixing Zhu, Zheng Fang, Gabriele Pergola, Rob Procter, Yulan He

Building models to detect vaccine attitudes on social media is challenging because of the composite, often intricate aspects involved, and the limited availability of annotated data. Existing approaches have relied heavily on supervised training that requires abundant annotations and pre-defined aspect categories. Instead, with the aim of leveraging the large amount of unannotated data now available on vaccination, we propose a novel semi-supervised approach for vaccine attitude detection, called VADet. A variational autoencoding architecture based on language models is employed to learn from unlabelled data the topical information of the domain. Then, the model is fine-tuned with a few manually annotated examples of user attitudes. We validate the effectiveness of VADet on our annotated data and also on an existing vaccination corpus annotated with opinions on vaccines. Our results show that VADet is able to learn disentangled stance and aspect topics, and outperforms existing aspect-based sentiment analysis models on both stance detection and tweet clustering.

  Access Paper or Ask Questions

Automatic Fake News Detection: Are current models "fact-checking" or "gut-checking"?

Apr 14, 2022
Ian Kelk, Benjamin Basseri, Wee Yi Lee, Richard Qiu, Chris Tanner

Automatic fake news detection models are ostensibly based on logic, where the truth of a claim made in a headline can be determined by supporting or refuting evidence found in a resulting web query. These models are believed to be reasoning in some way; however, it has been shown that these same results, or better, can be achieved without considering the claim at all -- only the evidence. This implies that other signals are contained within the examined evidence, and could be based on manipulable factors such as emotion, sentiment, or part-of-speech (POS) frequencies, which are vulnerable to adversarial inputs. We neutralize some of these signals through multiple forms of both neural and non-neural pre-processing and style transfer, and find that this flattening of extraneous indicators can induce the models to actually require both claims and evidence to perform well. We conclude with the construction of a model using emotion vectors built off a lexicon and passed through an "emotional attention" mechanism to appropriately weight certain emotions. We provide quantifiable results that prove our hypothesis that manipulable features are being used for fact-checking.

* 8 pages, 4 figures, 1 table, To appear in The Fifth FEVER Workshop 26th May 2022 Co-located with ACL 2022 

  Access Paper or Ask Questions

XLM-T: A Multilingual Language Model Toolkit for Twitter

Apr 25, 2021
Francesco Barbieri, Luis Espinosa Anke, Jose Camacho-Collados

Language models are ubiquitous in current NLP, and their multilingual capacity has recently attracted considerable attention. However, current analyses have almost exclusively focused on (multilingual variants of) standard benchmarks, and have relied on clean pre-training and task-specific corpora as multilingual signals. In this paper, we introduce XLM-T, a framework for using and evaluating multilingual language models in Twitter. This framework features two main assets: (1) a strong multilingual baseline consisting of an XLM-R (Conneau et al. 2020) model pre-trained on millions of tweets in over thirty languages, alongside starter code to subsequently fine-tune on a target task; and (2) a set of unified sentiment analysis Twitter datasets in eight different languages. This is a modular framework that can easily be extended to additional tasks, as well as integrated with recent efforts also aimed at the homogenization of Twitter-specific datasets (Barbieri et al. 2020).

* Submitted to ACL demo. Code and data available at 

  Access Paper or Ask Questions

AngryBERT: Joint Learning Target and Emotion for Hate Speech Detection

Mar 14, 2021
Md Rabiul Awal, Rui Cao, Roy Ka-Wei Lee, Sandra Mitrovic

Automated hate speech detection in social media is a challenging task that has recently gained significant traction in the data mining and Natural Language Processing community. However, most of the existing methods adopt a supervised approach that depended heavily on the annotated hate speech datasets, which are imbalanced and often lack training samples for hateful content. This paper addresses the research gaps by proposing a novel multitask learning-based model, AngryBERT, which jointly learns hate speech detection with sentiment classification and target identification as secondary relevant tasks. We conduct extensive experiments to augment three commonly-used hate speech detection datasets. Our experiment results show that AngryBERT outperforms state-of-the-art single-task-learning and multitask learning baselines. We conduct ablation studies and case studies to empirically examine the strengths and characteristics of our AngryBERT model and show that the secondary tasks are able to improve hate speech detection.

* Paper Accepted for 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining 

  Access Paper or Ask Questions

Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach

Feb 28, 2021
Anshul Wadhawan, Akshita Aggarwal

In the last few years, emotion detection in social-media text has become a popular problem due to its wide ranging application in better understanding the consumers, in psychology, in aiding human interaction with computers, designing smart systems etc. Because of the availability of huge amounts of data from social-media, which is regularly used for expressing sentiments and opinions, this problem has garnered great attention. In this paper, we present a Hinglish dataset labelled for emotion detection. We highlight a deep learning based approach for detecting emotions in Hindi-English code mixed tweets, using bilingual word embeddings derived from FastText and Word2Vec approaches, as well as transformer based models. We experiment with various deep learning models, including CNNs, LSTMs, Bi-directional LSTMs (with and without attention), along with transformers like BERT, RoBERTa, and ALBERT. The transformer based BERT model outperforms all other models giving the best performance with an accuracy of 71.43%.

  Access Paper or Ask Questions

A Stochastic Time Series Model for Predicting Financial Trends using NLP

Feb 02, 2021
Pratyush Muthukumar, Jie Zhong

Stock price forecasting is a highly complex and vitally important field of research. Recent advancements in deep neural network technology allow researchers to develop highly accurate models to predict financial trends. We propose a novel deep learning model called ST-GAN, or Stochastic Time-series Generative Adversarial Network, that analyzes both financial news texts and financial numerical data to predict stock trends. We utilize cutting-edge technology like the Generative Adversarial Network (GAN) to learn the correlations among textual and numerical data over time. We develop a new method of training a time-series GAN directly using the learned representations of Naive Bayes' sentiment analysis on financial text data alongside technical indicators from numerical data. Our experimental results show significant improvement over various existing models and prior research on deep neural networks for stock price forecasting.

* 16 pages, 7 figures 

  Access Paper or Ask Questions

Explaining NLP Models via Minimal Contrastive Editing (MiCE)

Dec 27, 2020
Alexis Ross, Ana Marasović, Matthew E. Peters

Humans give contrastive explanations that explain why an observed event happened rather than some other counterfactual event (the contrast case). Despite the important role that contrastivity plays in how people generate and evaluate explanations, this property is largely missing from current methods for explaining NLP models. We present Minimal Contrastive Editing (MiCE), a method for generating contrastive explanations of model predictions in the form of edits to inputs that change model outputs to the contrast case. Our experiments across three tasks -- binary sentiment classification, topic classification, and multiple-choice question answering -- show that MiCE is able to produce edits that are not only contrastive, but also minimal and fluent, consistent with human contrastive edits. We demonstrate how MiCE edits can be used for two use cases in NLP system development -- uncovering dataset artifacts and debugging incorrect model predictions -- and thereby illustrate that generating contrastive explanations is a promising research direction for model interpretability.

  Access Paper or Ask Questions