"Sentiment Analysis": models, code, and papers

An Automatic Contextual Analysis and Clustering Classifiers Ensemble approach to Sentiment Analysis

May 29, 2017
Murtadha Talib AL-Sharuee, Fei Liu, Mahardhika Pratama

Product reviews are one of the major resources for determining public sentiment. The existing literature on review sentiment analysis mainly uses the supervised paradigm, which needs labeled data for training and suffers from domain dependency. This article addresses these issues by describing a completely automatic approach for sentiment analysis based on unsupervised ensemble learning. The method consists of two phases. The first phase is contextual analysis, which has five processes, namely (1) data preparation; (2) spelling correction; (3) intensifier handling; (4) negation handling and (5) contrast handling. The second phase comprises the unsupervised learning approach, which is an ensemble of clustering classifiers using a majority voting mechanism with different weight schemes. The base classifier of the ensemble method is a modified k-means algorithm, in which the initial centroids are extracted from the feature set using SentiWordNet (SWN). We also introduce new sentiment analysis problems of Australian airlines and home builders which offer potential benchmark problems in the sentiment analysis field. Our experiments on datasets from different domains show that the contextual analysis and ensemble phases improve the clustering performance in terms of accuracy, stability and generalization ability.

* This article is submitted to a journal 
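
The voting step described in the abstract can be illustrated with a minimal sketch. It assumes each base classifier has already produced one label per document and that cluster labels are aligned across classifiers; the function name and the example weights are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights=None):
    """Combine per-document labels from several base classifiers.

    predictions: list of label lists, one list per base classifier.
    weights: optional list of floats, one per classifier (hypothetical
             weighting scheme); defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    n_docs = len(predictions[0])
    combined = []
    for i in range(n_docs):
        score = defaultdict(float)
        for labels, weight in zip(predictions, weights):
            score[labels[i]] += weight
        # Iterating over sorted labels makes ties resolve deterministically
        # to the alphabetically first label.
        combined.append(max(sorted(score), key=lambda label: score[label]))
    return combined


preds = [
    ["pos", "neg", "pos"],  # classifier A
    ["pos", "pos", "neg"],  # classifier B
    ["neg", "neg", "neg"],  # classifier C
]
print(weighted_majority_vote(preds))
print(weighted_majority_vote(preds, weights=[1, 1, 3]))
```

With equal weights the vote follows the simple majority; giving classifier C a weight of 3 lets it override the other two, which is the effect a weight scheme is meant to have.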

Text Compression for Sentiment Analysis via Evolutionary Algorithms

Sep 20, 2017
Emmanuel Dufourq, Bruce A. Bassett

Can textual data be compressed intelligently without losing accuracy in evaluating sentiment? In this study, we propose a novel evolutionary compression algorithm, PARSEC (PARts-of-Speech for sEntiment Compression), which makes use of Parts-of-Speech tags to compress text in a way that sacrifices minimal classification accuracy when used in conjunction with sentiment analysis algorithms. An analysis of PARSEC with eight commercial and non-commercial sentiment analysis algorithms on twelve English sentiment data sets reveals that accurate compression is possible with (0%, 1.3%, 3.3%) loss in sentiment classification accuracy for (20%, 50%, 75%) data compression with PARSEC using LingPipe, the most accurate of the sentiment algorithms. Other sentiment analysis algorithms are more severely affected by compression. We conclude that significant compression of text data is possible for sentiment analysis depending on the accuracy demands of the specific application and the specific sentiment analysis algorithm used.

* 8 pages, 2 figures, 8 tables 
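
The abstract does not spell out the compression rule, so the following is only a sketch under one assumption: that compressing text means dropping tokens according to their Parts-of-Speech tags. The function name, the tag set to drop, and the pre-tagged input are all hypothetical; in PARSEC the choice is made by an evolutionary algorithm rather than by hand.

```python
def compress(tagged_tokens, dropped_tags):
    """Drop every token whose POS tag is in dropped_tags.

    tagged_tokens: list of (word, tag) pairs, assumed to come from any
                   POS tagger run beforehand.
    Returns the compressed text and the fraction of tokens removed.
    """
    kept = [word for word, tag in tagged_tokens if tag not in dropped_tags]
    removed = len(tagged_tokens) - len(kept)
    ratio = removed / len(tagged_tokens) if tagged_tokens else 0.0
    return " ".join(kept), ratio


tagged = [("the", "DT"), ("movie", "NN"), ("was", "VBD"),
          ("really", "RB"), ("good", "JJ")]
text, ratio = compress(tagged, dropped_tags={"DT", "VBD"})
print(text, ratio)
```

Here two of five tokens are removed while the sentiment-bearing words survive, which is the kind of trade-off between compression level and classification accuracy that the abstract reports.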

SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis

May 20, 2020
Hao Tian, Can Gao, Xinyan Xiao, Hao Liu, Bolei He, Hua Wu, Haifeng Wang, Feng Wu

Recently, sentiment analysis has seen remarkable advances with the help of pre-training approaches. However, sentiment knowledge, such as sentiment words and aspect-sentiment pairs, is ignored in the process of pre-training, despite the fact that it is widely used in traditional sentiment analysis approaches. In this paper, we introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks. With the help of automatically-mined knowledge, SKEP conducts sentiment masking and constructs three sentiment knowledge prediction objectives, so as to embed sentiment information at the word, polarity and aspect level into the pre-trained sentiment representation. In particular, the prediction of aspect-sentiment pairs is converted into multi-label classification, aiming to capture the dependency between words in a pair. Experiments on three kinds of sentiment tasks show that SKEP significantly outperforms strong pre-training baselines, and achieves new state-of-the-art results on most of the test datasets. We release our code at

* Accepted by ACL2020 

Sentiment Analysis on Speaker Specific Speech Data

Feb 17, 2018
Maghilnan S, Rajesh Kumar M

Sentiment analysis has evolved over the past few decades, and most of the work in it has revolved around textual sentiment analysis with text mining techniques. Audio sentiment analysis, however, is still at a nascent stage in the research community. In this proposed research, we perform sentiment analysis on speaker-discriminated speech transcripts to detect the emotions of the individual speakers involved in the conversation. We analyzed different techniques for speaker discrimination and sentiment analysis to find efficient algorithms to perform this task.

* Accepted and Published in 2017 IEEE International Conference on Intelligent Computing and Control (I2C2), 23 Jun - 24 Jun 2017, India 

Deriving Emotions and Sentiments from Visual Content: A Disaster Analysis Use Case

Feb 03, 2020
Kashif Ahmad, Syed Zohaib, Nicola Conci, Ala Al-Fuqaha

Sentiment analysis aims to extract and express a person's perception, opinions and emotions towards an entity, object, product or service, enabling businesses to obtain feedback from consumers. The increasing popularity of social networks and users' tendency towards sharing their feelings, expressions and opinions in text, visual and audio content has opened new opportunities and challenges in sentiment analysis. While sentiment analysis of text streams has been widely explored in the literature, sentiment analysis of images and videos is relatively new. This article introduces visual sentiment analysis and contrasts it with textual sentiment analysis, with emphasis on the opportunities and challenges in this nascent research area. We also propose a deep visual sentiment analyzer for disaster-related images as a use case, covering different aspects of visual sentiment analysis starting from data collection, annotation, model selection, implementation and evaluations. We believe such rigorous analysis will provide a baseline for future research in the domain.

Emotion helps Sentiment: A Multi-task Model for Sentiment and Emotion Analysis

Nov 28, 2019
Abhishek Kumar, Asif Ekbal, Daisuke Kawahara, Sadao Kurohashi

In this paper, we propose a two-layered multi-task attention based neural network that performs sentiment analysis through emotion analysis. The proposed approach is based on Bidirectional Long Short-Term Memory and uses Distributional Thesaurus as a source of external knowledge to improve the sentiment and emotion prediction. The proposed system has two levels of attention to hierarchically build a meaningful representation. We evaluate our system on the benchmark dataset of SemEval 2016 Task 6 and also compare it with the state-of-the-art systems on Stance Sentiment Emotion Corpus. Experimental results show that the proposed system improves the performance of sentiment analysis by 3.2 F-score points on SemEval 2016 Task 6 dataset. Our network also boosts the performance of emotion analysis by 5 F-score points on Stance Sentiment Emotion Corpus.

* Accepted in the Proceedings of The 2019 IEEE International Joint Conference on Neural Networks (IJCNN 2019) 

A Scalable, Lexicon Based Technique for Sentiment Analysis

Oct 08, 2014
Chetan Kaushik, Atul Mishra

The rapid increase in the volume of sentiment-rich social media on the web has resulted in increased interest among researchers in sentiment analysis and opinion mining. However, with so much social media available on the web, sentiment analysis is now considered a big data task. Hence, conventional sentiment analysis approaches fail to efficiently handle the vast amount of sentiment data available nowadays. The main focus of the research was to find a technique that can efficiently perform sentiment analysis on big data sets, one that can categorize text as positive, negative or neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large data set of tweets using Hadoop, and the performance of the technique was measured in terms of speed and accuracy. The experimental results show that the technique exhibits very good efficiency in handling big sentiment data sets.

* International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.5, September 2014 
* 9 pages 1 figure 2 tables 
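
A lexicon-based classifier of the kind the abstract describes can be sketched in a few lines. The lexicon entries, their scores, and the function name below are illustrative assumptions and are not taken from the paper.

```python
# Toy lexicon: the words and scores are illustrative assumptions.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def classify(text, lexicon=LEXICON):
    """Label text as positive, negative or neutral from summed word scores."""
    tokens = text.lower().split()
    # Strip trailing punctuation so "food." still matches "food".
    score = sum(lexicon.get(token.strip(".,!?"), 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("I love this great phone!"))
print(classify("Awful service, bad food."))
print(classify("The package arrived today"))
```

Because each tweet is scored independently of the others, a function like this could plausibly run as the map step of a Hadoop job; that is an assumption about how such a job might be structured, not a detail given in the abstract.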

A Survey on sentiment analysis in Persian: A Comprehensive System Perspective Covering Challenges and Advances in Resources, and Methods

Apr 30, 2021
Zeinab Rajabi, MohammadReza Valavi

Social media has grown remarkably during the past few years. Nowadays, posting messages on social media websites has become one of the most popular Internet activities. The vast amount of user-generated content has made social media the most extensive data source of public opinion. Sentiment analysis is one of the techniques used to analyze user-generated data. The Persian language has specific features and thereby requires unique methods and models to be adopted for sentiment analysis, which are different from those in the English language. Sentiment analysis in each language has specific prerequisites; hence, the direct use of methods, tools, and resources developed for the English language in Persian has its limitations. The main target of this paper is to provide a comprehensive literature survey of state-of-the-art advances in Persian sentiment analysis. In this regard, the present study aims to investigate and compare the previous sentiment analysis studies on Persian texts and describe contributions presented in articles published in the last decade. First, the levels, approaches, and tasks for sentiment analysis are described. Then, a detailed survey of the sentiment analysis methods used for Persian texts is presented, and previous relevant works on the Persian language are discussed. Moreover, we present in this survey the authentic and published standard sentiment analysis resources and advances that have been made for Persian sentiment analysis. Finally, according to the state-of-the-art development of English sentiment analysis, some issues and challenges not being addressed in Persian texts are listed, and some guidelines and trends are provided for future research on Persian texts. The paper provides information to help new or established researchers in the field as well as industry developers who aim to deploy an operational complete sentiment analysis system.

* 31 pages, 2 figures, 5 tables

Identification of Bias Against People with Disabilities in Sentiment Analysis and Toxicity Detection Models

Nov 25, 2021
Pranav Narayanan Venkit, Shomir Wilson

Sociodemographic biases are a common problem for natural language processing, affecting the fairness and integrity of its applications. Within sentiment analysis, these biases may undermine sentiment predictions for texts that mention personal attributes that unbiased human readers would consider neutral. Such discrimination can have great consequences in the applications of sentiment analysis both in the public and private sectors. For example, incorrect inferences in applications like online abuse and opinion analysis in social media platforms can lead to unwanted ramifications, such as wrongful censoring, towards certain populations. In this paper, we address the discrimination against people with disabilities (PWD) by sentiment analysis and toxicity classification models. We provide an examination of sentiment and toxicity analysis models to understand in detail how they discriminate against PWD. We present the Bias Identification Test in Sentiments (BITS), a corpus of 1,126 sentences designed to probe sentiment analysis models for biases in disability. We use this corpus to demonstrate statistically significant biases in four widely used sentiment analysis tools (TextBlob, VADER, Google Cloud Natural Language API and DistilBERT) and two toxicity analysis models trained to predict toxic comments on Jigsaw challenges (Toxic comment classification and Unintended Bias in Toxic comments). The results show that all exhibit strong negative biases on sentences that mention disability. We publicly release the BITS Corpus for others to identify potential biases against disability in any sentiment analysis tools and also to update the corpus to be used as a test for other sociodemographic variables as well.

The emojification of sentiment on social media: Collection and analysis of a longitudinal Twitter sentiment dataset

Aug 31, 2021
Wenjie Yin, Rabab Alkhalifa, Arkaitz Zubiaga

Social media, as a means for computer-mediated communication, has been extensively used to study the sentiment expressed by users around events or topics. There is however a gap in the longitudinal study of how sentiment evolved in social media over the years. To fill this gap, we develop TM-Senti, a new large-scale, distantly supervised Twitter sentiment dataset with over 184 million tweets and covering a time period of over seven years. We describe and assess our methodology to put together a large-scale, emoticon- and emoji-based labelled sentiment analysis dataset, along with an analysis of the resulting dataset. Our analysis highlights interesting temporal changes, among others in the increasing use of emojis over emoticons. We publicly release the dataset for further research in tasks including sentiment analysis and text classification of tweets. The dataset can be fully rehydrated including tweet metadata and without missing tweets thanks to the archive of tweets publicly available on the Internet Archive, which the dataset is based on.
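
Distant supervision from emoticons and emojis, as mentioned in the abstract, can be illustrated with a minimal sketch. The marker sets, the function name, and the rule of discarding tweets with no markers or conflicting markers are assumptions made for illustration and do not reproduce the TM-Senti pipeline.

```python
# Illustrative marker sets; the actual lists used for TM-Senti may differ.
POSITIVE = {":)", ":-)", ":D", "😊", "😍"}
NEGATIVE = {":(", ":-(", "😢", "😡"}


def distant_label(tweet):
    """Return (text_without_markers, label), or None if no clear label."""
    tokens = tweet.split()
    has_pos = any(token in POSITIVE for token in tokens)
    has_neg = any(token in NEGATIVE for token in tokens)
    if has_pos == has_neg:
        # Either no marker at all or conflicting markers: skip the tweet.
        return None
    label = "positive" if has_pos else "negative"
    # Remove the markers so a model cannot simply memorize them.
    markers = POSITIVE | NEGATIVE
    text = " ".join(token for token in tokens if token not in markers)
    return text, label


print(distant_label("great game :)"))
print(distant_label("missed the bus 😢"))
print(distant_label("so so :) :("))
```

The sketch splits on whitespace only, so a marker glued to a word (for example `game:)`) goes undetected; a real pipeline would need a tokenizer that separates emoticons and emojis from adjacent text.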

Visual Sentiment Analysis from Disaster Images in Social Media

Sep 04, 2020
Syed Zohaib Hassan, Kashif Ahmad, Steven Hicks, Paal Halvorsen, Ala Al-Fuqaha, Nicola Conci, Michael Riegler

The increasing popularity of social networks and users' tendency towards sharing their feelings, expressions, and opinions in text, visual, and audio content have opened new opportunities and challenges in sentiment analysis. While sentiment analysis of text streams has been widely explored in the literature, sentiment analysis from images and videos is relatively new. This article focuses on visual sentiment analysis in a societally important domain, namely disaster analysis in social media. To this end, we propose a deep visual sentiment analyzer for disaster-related images, covering different aspects of visual sentiment analysis starting from data collection, annotation, model selection, implementation, and evaluations. For data annotation, and for analyzing people's sentiments towards natural disasters and associated images in social media, a crowd-sourcing study has been conducted with a large number of participants worldwide. The crowd-sourcing study resulted in a large-scale benchmark dataset with four different sets of annotations, each aimed at a separate task. The presented analysis and the associated dataset will provide a baseline/benchmark for future research in the domain. We believe the proposed system can contribute toward more livable communities by helping different stakeholders, such as news broadcasters, humanitarian organizations, as well as the general public.

* 10 pages, 6 figures, 6 tables. arXiv admin note: substantial text overlap with arXiv:2002.03773 

Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks

Sep 20, 2015
Quanzeng You, Jiebo Luo, Hailin Jin, Jianchao Yang

Sentiment analysis of online user generated content is important for many social media analytics tasks. Researchers have largely relied on textual sentiment analysis to develop systems to predict political elections, measure economic indicators, and so on. Recently, social media users are increasingly using images and videos to express their opinions and share their experiences. Sentiment analysis of such large scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that prediction of sentiment from visual content is complementary to textual sentiment analysis. Motivated by the needs in leveraging large scale yet noisy training data to solve the extremely challenging problem of image sentiment analysis, we employ Convolutional Neural Networks (CNN). We first design a suitable CNN architecture for image sentiment analysis. We obtain half a million training samples by using a baseline sentiment algorithm to label Flickr images. To make use of such noisy machine labeled data, we employ a progressive strategy to fine-tune the deep network. Furthermore, we improve the performance on Twitter images by inducing domain transfer with a small number of manually labeled Twitter images. We have conducted extensive experiments on manually labeled Twitter images. The results show that the proposed CNN can achieve better performance in image sentiment analysis than competing algorithms.

* 9 pages, 5 figures, AAAI 2015 

Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM

Aug 07, 2018
Yuxiao Chen, Jianbo Yuan, Quanzeng You, Jiebo Luo

Sentiment analysis on large-scale social media data is important to bridge the gaps between social media contents and real world activities including political election prediction, individual and public emotional status monitoring and analysis, and so on. Although textual sentiment analysis has been well studied based on platforms such as Twitter and Instagram, analysis of the role of extensive emoji uses in sentiment analysis remains light. In this paper, we propose a novel scheme for Twitter sentiment analysis with extra attention on emojis. We first learn bi-sense emoji embeddings under positive and negative sentimental tweets individually, and then train a sentiment classifier by attending on these bi-sense emoji embeddings with an attention-based long short-term memory network (LSTM). Our experiments show that the bi-sense embedding is effective for extracting sentiment-aware embeddings of emojis and outperforms the state-of-the-art models. We also visualize the attentions to show that the bi-sense emoji embedding provides better guidance on the attention mechanism to obtain a more robust understanding of the semantics and sentiments.

Sentiment Analysis by Using Fuzzy Logic

Mar 13, 2014
Md. Ansarul Haque

How can a product or service be reasonably evaluated by anyone in the shortest time? It is a million-dollar question, but it has a simple answer: sentiment analysis. Sentiment analysis examines consumers' reviews of products and services, which helps both producers and consumers (stakeholders) make effective and efficient decisions within a short period of time. Producers can gain better knowledge of their products and services through sentiment analysis (e.g. positive and negative comments, or consumers' likes and dislikes), which helps them understand the status of their products (e.g. product limitations or market status). Consumers can likewise gain better knowledge of the products and services they are interested in through sentiment analysis (e.g. positive and negative comments, or consumers' likes and dislikes), which helps them understand the status of those products (e.g. product limitations or market status). For a finer specification of sentiment values, fuzzy logic can be introduced. Therefore, sentiment analysis with the help of fuzzy logic (which deals with reasoning and gives a closer view of the exact sentiment values) will help producers, consumers or any interested person make effective decisions according to their product or service interest.

* 16 pages, February 2014

Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning

Feb 03, 2018
Minghai Chen, Sen Wang, Paul Pu Liang, Tadas Baltrušaitis, Amir Zadeh, Louis-Philippe Morency

With the increasing popularity of video sharing websites such as YouTube and Facebook, multimodal sentiment analysis has received increasing attention from the scientific community. Contrary to previous works in multimodal sentiment analysis which focus on holistic information in speech segments such as bag of words representations and average facial expression intensity, we develop a novel deep architecture for multimodal sentiment analysis that performs modality fusion at the word level. In this paper, we propose the Gated Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model that is composed of 2 modules. The Gated Multimodal Embedding alleviates the difficulties of fusion when there are noisy modalities. The LSTM with Temporal Attention performs word level fusion at a finer fusion resolution between input modalities and attends to the most important time steps. As a result, the GME-LSTM(A) is able to better model the multimodal structure of speech through time and perform better sentiment comprehension. We demonstrate the effectiveness of this approach on the publicly-available Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis (CMU-MOSI) dataset by achieving state-of-the-art sentiment classification and regression results. Qualitative analysis on our model emphasizes the importance of the Temporal Attention Layer in sentiment prediction because the additional acoustic and visual modalities are noisy. We also demonstrate the effectiveness of the Gated Multimodal Embedding in selectively filtering these noisy modalities out. Our results and analysis open new areas in the study of sentiment analysis in human communication and provide new models for multimodal fusion.

* ICMI 2017 Oral Presentation, Honorable Mention Award 

The Evolution of Sentiment Analysis - A Review of Research Topics, Venues, and Top Cited Papers

Nov 21, 2017
Mika Viking Mäntylä, Daniel Graziotin, Miikka Kuutila

Sentiment analysis is one of the fastest growing research areas in computer science, making it challenging to keep track of all the activities in the area. We present a computer-assisted literature review, where we utilize both text mining and qualitative coding, and analyze 6,996 papers from Scopus. We find that the roots of sentiment analysis are in the studies on public opinion analysis at the beginning of the 20th century and in the text subjectivity analysis performed by the computational linguistics community in the 1990s. However, the outbreak of computer-based sentiment analysis only occurred with the availability of subjective texts on the Web. Consequently, 99% of the papers have been published after 2004. Sentiment analysis papers are scattered across multiple publication venues, and the combined number of papers in the top-15 venues represents only ca. 30% of the papers in total. We present the top-20 cited papers from Google Scholar and Scopus and a taxonomy of research topics. In recent years, sentiment analysis has shifted from analyzing online product reviews to social media texts from Twitter and Facebook. Many topics beyond product reviews, like stock markets, elections, disasters, medicine, software engineering and cyberbullying, extend the utilization of sentiment analysis.

* Computer Science Review, Volume 27, February 2018, Pages 16-32 
* 29 pages, 14 figures 

BiERU: Bidirectional Emotional Recurrent Unit for Conversational Sentiment Analysis

May 31, 2020
Wei Li, Wei Shao, Shaoxiong Ji, Erik Cambria

Sentiment analysis in conversations has gained increasing attention in recent years because of the growing number of applications it can serve, e.g., sentiment analysis, recommender systems, and human-robot interaction. The main difference between conversational sentiment analysis and single-sentence sentiment analysis is the existence of context information, which may influence the sentiment of an utterance in a dialogue. How to effectively encode contextual information in dialogues, however, remains a challenge. Existing approaches employ complicated deep learning structures to distinguish different parties in a conversation and then model the context information. In this paper, we propose a fast, compact and parameter-efficient party-ignorant framework named bidirectional emotional recurrent unit for conversational sentiment analysis. In our system, a generalized neural tensor block followed by a two-channel classifier is designed to perform context compositionality and sentiment classification, respectively. Extensive experiments on three standard datasets demonstrate that our model outperforms the state of the art in most cases.

* 9 pages, 7 figures 

Tweets Sentiment Analysis via Word Embeddings and Machine Learning Techniques

Jul 05, 2020
Aditya Sharma, Alex Daniels

Sentiment analysis of social media data deals with attitudes, assessments, and emotions, which can be considered a reflection of the way humans think. Understanding and classifying a large collection of documents into positive and negative aspects is a very difficult task. Social networks such as Twitter, Facebook, and Instagram provide a platform for gathering information about people's sentiments and opinions. The fact that people spend hours daily on social media and share their opinions on various topics helps us analyze sentiments better. More and more companies are using social media tools to provide various services and interact with customers. Sentiment Analysis (SA) classifies the polarity of given tweets as positive or negative in order to understand the sentiments of the public. This paper aims to perform sentiment analysis of real-time 2019 election Twitter data using the feature selection model word2vec and the machine learning algorithm random forest for sentiment classification. Word2vec with Random Forest improves the accuracy of sentiment analysis significantly compared to traditional methods such as BOW and TF-IDF. Word2vec improves the quality of features by considering the contextual semantics of words in a text, hence improving the accuracy of machine learning and sentiment analysis.

Sentiment analysis for Arabic language: A brief survey of approaches and techniques

Sep 15, 2018
Mo'ath Alrefai, Hossam Faris, Ibrahim Aljarah

With the emergence of Web 2.0 technology and the expansion of on-line social networks, current Internet users have the ability to add their reviews, ratings and opinions on social media and on commercial and news web sites. Sentiment analysis aims to classify these reviews in an automatic way. In the literature, there are numerous approaches proposed for automatic sentiment analysis for different language contexts. Each language has its own properties that make sentiment analysis more challenging. In this regard, this work presents a comprehensive survey of existing Arabic sentiment analysis studies, and covers the various approaches and techniques proposed in the literature. Moreover, we highlight the main difficulties and challenges of Arabic sentiment analysis, and the techniques proposed in the literature to overcome these barriers.
