Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Assessing Emoji Use in Modern Text Processing Tools

Jan 02, 2021
Abu Awal Md Shoeb, Gerard de Melo

Emojis have become ubiquitous in digital communication, due to their visual appeal as well as their ability to vividly convey human emotion, among other factors. The growing prominence of emojis in social media and other instant messaging also leads to an increased need for systems and tools to operate on text containing emojis. In this study, we assess this support by considering test sets of tweets with emojis, based on which we perform a series of experiments investigating the ability of prominent NLP and text processing tools to adequately process them. In particular, we consider tokenization, part-of-speech tagging, as well as sentiment analysis. Our findings show that many tools still have notable shortcomings when operating on text containing emojis.

  Access Paper or Ask Questions

A Simple and Effective Approach for Fine Tuning Pre-trained Word Embeddings for Improved Text Classification

Aug 07, 2019
Amr Al-Khatib, Samhaa R. El-Beltagy

This work presents a new and simple approach for fine-tuning pretrained word embeddings for text classification tasks. In this approach, the class in which a term appears, acts as an additional contextual variable during the fine tuning process, and contributes to the final word vector for that term. As a result, words that are used distinctively within a particular class, will bear vectors that are closer to each other in the embedding space and will be more discriminative towards that class. To validate this novel approach, it was applied to three Arabic and two English datasets that have been previously used for text classification tasks such as sentiment analysis and emotion detection. In the vast majority of cases, the results obtained using the proposed approach, improved considerably.

  Access Paper or Ask Questions

Using Deep Networks and Transfer Learning to Address Disinformation

May 24, 2019
Numa Dhamani, Paul Azunre, Jeffrey L. Gleason, Craig Corcoran, Garrett Honke, Steve Kramer, Jonathon Morgan

We apply an ensemble pipeline composed of a character-level convolutional neural network (CNN) and a long short-term memory (LSTM) as a general tool for addressing a range of disinformation problems. We also demonstrate the ability to use this architecture to transfer knowledge from labeled data in one domain to related (supervised and unsupervised) tasks. Character-level neural networks and transfer learning are particularly valuable tools in the disinformation space because of the messy nature of social media, lack of labeled data, and the multi-channel tactics of influence campaigns. We demonstrate their effectiveness in several tasks relevant for detecting disinformation: spam emails, review bombing, political sentiment, and conversation clustering.

* AI for Social Good Workshop at the International Conference on Machine Learning, Long Beach, United States (2019) 

  Access Paper or Ask Questions

Subspace Clustering of Very Sparse High-Dimensional Data

Jan 25, 2019
Hankui Peng, Nicos Pavlidis, Idris Eckley, Ioannis Tsalamanis

In this paper we consider the problem of clustering collections of very short texts using subspace clustering. This problem arises in many applications such as product categorisation, fraud detection, and sentiment analysis. The main challenge lies in the fact that the vectorial representation of short texts is both high-dimensional, due to the large number of unique terms in the corpus, and extremely sparse, as each text contains a very small number of words with no repetition. We propose a new, simple subspace clustering algorithm that relies on linear algebra to cluster such datasets. Experimental results on identifying product categories from product names obtained from the US Amazon website indicate that the algorithm can be competitive against state-of-the-art clustering algorithms.

* 2018 IEEE International Conference on Big Data 

  Access Paper or Ask Questions

Multiple-Source Adaptation for Regression Problems

Nov 14, 2017
Judy Hoffman, Mehryar Mohri, Ningshan Zhang

We present a detailed theoretical analysis of the problem of multiple-source adaptation in the general stochastic scenario, extending known results that assume a single target labeling function. Our results cover a more realistic scenario and show the existence of a single robust predictor accurate for \emph{any} target mixture of the source distributions. Moreover, we present an efficient and practical optimization solution to determine the robust predictor in the important case of squared loss, by casting the problem as an instance of DC-programming. We report the results of experiments with both an artificial task and a sentiment analysis task. We find that our algorithm outperforms competing approaches by producing a single robust model that performs well on any target mixture distribution.

  Access Paper or Ask Questions

AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification

Jun 05, 2017
Depeng Liang, Yongdong Zhang

Recently deeplearning models have been shown to be capable of making remarkable performance in sentences and documents classification tasks. In this work, we propose a novel framework called AC-BLSTM for modeling sentences and documents, which combines the asymmetric convolution neural network (ACNN) with the Bidirectional Long Short-Term Memory network (BLSTM). Experiment results demonstrate that our model achieves state-of-the-art results on five tasks, including sentiment analysis, question type classification, and subjectivity classification. In order to further improve the performance of AC-BLSTM, we propose a semi-supervised learning framework called G-AC-BLSTM for text classification by combining the generative model with AC-BLSTM.

* 9 pages 

  Access Paper or Ask Questions

Rationalizing Neural Predictions

Nov 02, 2016
Tao Lei, Regina Barzilay, Tommi Jaakkola

Prediction without justification has limited applicability. As a remedy, we learn to extract pieces of input text as justifications -- rationales -- that are tailored to be short and coherent, yet sufficient for making the same prediction. Our approach combines two modular components, generator and encoder, which are trained to operate well together. The generator specifies a distribution over text fragments as candidate rationales and these are passed through the encoder for prediction. Rationales are never given during training. Instead, the model is regularized by desiderata for rationales. We evaluate the approach on multi-aspect sentiment analysis against manually annotated test cases. Our approach outperforms attention-based baseline by a significant margin. We also successfully illustrate the method on the question retrieval task.

* EMNLP 2016 

  Access Paper or Ask Questions

Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game

Jun 15, 2015
Vlad Niculae, Srijan Kumar, Jordan Boyd-Graber, Cristian Danescu-Niculescu-Mizil

Interpersonal relations are fickle, with close friendships often dissolving into enmity. In this work, we explore linguistic cues that presage such transitions by studying dyadic interactions in an online strategy game where players form alliances and break those alliances through betrayal. We characterize friendships that are unlikely to last and examine temporal patterns that foretell betrayal. We reveal that subtle signs of imminent betrayal are encoded in the conversational patterns of the dyad, even if the victim is not aware of the relationship's fate. In particular, we find that lasting friendships exhibit a form of balance that manifests itself through language. In contrast, sudden changes in the balance of certain conversational attributes---such as positive sentiment, politeness, or focus on future planning---signal impending betrayal.

* To appear at ACL 2015. 10pp, 4 fig. Data and other info available at 

  Access Paper or Ask Questions

Conversational Analysis of Daily Dialog Data using Polite Emotional Dialogue Acts

May 11, 2022
Chandrakant Bothe, Stefan Wermter

Many socio-linguistic cues are used in conversational analysis, such as emotion, sentiment, and dialogue acts. One of the fundamental cues is politeness, which linguistically possesses properties such as social manners useful in conversational analysis. This article presents findings of polite emotional dialogue act associations, where we can correlate the relationships between the socio-linguistic cues. We confirm our hypothesis that the utterances with the emotion classes Anger and Disgust are more likely to be impolite. At the same time, Happiness and Sadness are more likely to be polite. A less expectable phenomenon occurs with dialogue acts Inform and Commissive which contain more polite utterances than Question and Directive. Finally, we conclude on the future work of these findings to extend the learning of social behaviours using politeness.

* Accepted at LREC 2022 (pre-print). arXiv admin note: substantial text overlap with arXiv:2112.13572 

  Access Paper or Ask Questions

Modular Domain Adaptation

Apr 26, 2022
Junshen K. Chen, Dallas Card, Dan Jurafsky

Off-the-shelf models are widely used by computational social science researchers to measure properties of text, such as sentiment. However, without access to source data it is difficult to account for domain shift, which represents a threat to validity. Here, we treat domain adaptation as a modular process that involves separate model producers and model consumers, and show how they can independently cooperate to facilitate more accurate measurements of text. We introduce two lightweight techniques for this scenario, and demonstrate that they reliably increase out-of-domain accuracy on four multi-domain text classification datasets when used with linear and contextual embedding models. We conclude with recommendations for model producers and consumers, and release models and replication code to accompany this paper.

* Findings of ACL (2022) 

  Access Paper or Ask Questions