Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Sentiment Analysis": models, code, and papers

Multilevel sentiment analysis in arabic

May 24, 2022
Ahmed Nassar, Ebru Sezer

In this study, we aimed to improve the performance results of Arabic sentiment analysis. This can be achieved by investigating the most successful machine learning method and the most useful feature vector to classify sentiments in both term and document levels into two (positive or negative) categories. Moreover, specification of one polarity degree for the term that has more than one is investigated. Also to handle the negations and intensifications, some rules are developed. According to the obtained results, Artificial Neural Network classifier is nominated as the best classifier in both term and document level sentiment analysis (SA) for Arabic Language. Furthermore, the average F-score achieved in the term level SA for both positive and negative testing classes is 0.92. In the document level SA, the average F-score for positive testing classes is 0.94, while for negative classes is 0.93.

* 10 pages, 3 figures, Published in: 2019 IEEE 7th Palestinian International Conference on Electrical and Computer Engineering (PICECE), Date of Conference: 26-27 March 2019 
Access Paper or Ask Questions

Efficient Twitter Sentiment Classification using Subjective Distant Supervision

Jan 11, 2017
Tapan Sahni, Chinmay Chandak, Naveen Reddy Chedeti, Manish Singh

As microblogging services like Twitter are becoming more and more influential in today's globalised world, its facets like sentiment analysis are being extensively studied. We are no longer constrained by our own opinion. Others opinions and sentiments play a huge role in shaping our perspective. In this paper, we build on previous works on Twitter sentiment analysis using Distant Supervision. The existing approach requires huge computation resource for analysing large number of tweets. In this paper, we propose techniques to speed up the computation process for sentiment analysis. We use tweet subjectivity to select the right training samples. We also introduce the concept of EFWS (Effective Word Score) of a tweet that is derived from polarity scores of frequently used words, which is an additional heuristic that can be used to speed up the sentiment classification with standard machine learning algorithms. We performed our experiments using 1.6 million tweets. Experimental evaluations show that our proposed technique is more efficient and has higher accuracy compared to previously proposed methods. We achieve overall accuracies of around 80% (EFWS heuristic gives an accuracy around 85%) on a training dataset of 100K tweets, which is half the size of the dataset used for the baseline model. The accuracy of our proposed model is 2-3% higher than the baseline model, and the model effectively trains at twice the speed of the baseline model.

Access Paper or Ask Questions

Leveraging Cognitive Features for Sentiment Analysis

Jan 19, 2017
Abhijit Mishra, Diptesh Kanojia, Seema Nagar, Kuntal Dey, Pushpak Bhattacharyya

Sentiments expressed in user-generated short text and sentences are nuanced by subtleties at lexical, syntactic, semantic and pragmatic levels. To address this, we propose to augment traditional features used for sentiment analysis and sarcasm detection, with cognitive features derived from the eye-movement patterns of readers. Statistical classification using our enhanced feature set improves the performance (F-score) of polarity detection by a maximum of 3.7% and 9.3% on two datasets, over the systems that use only traditional features. We perform feature significance analysis, and experiment on a held-out dataset, showing that cognitive features indeed empower sentiment analyzers to handle complex constructs.

* The SIGNLL Conference on Computational Natural Language Learning (CoNLL 2016) 
Access Paper or Ask Questions

Sentiment Analysis for Reinforcement Learning

Oct 05, 2020
Ameet Deshpande, Eve Fleisig

While reinforcement learning (RL) has been successful in natural language processing (NLP) domains such as dialogue generation and text-based games, it typically faces the problem of sparse rewards that leads to slow or no convergence. Traditional methods that use text descriptions to extract only a state representation ignore the feedback inherently present in them. In text-based games, for example, descriptions like "Good Job! You ate the food}" indicate progress, and descriptions like "You entered a new room" indicate exploration. Positive and negative cues like these can be converted to rewards through sentiment analysis. This technique converts the sparse reward problem into a dense one, which is easier to solve. Furthermore, this can enable reinforcement learning without rewards, in which the agent learns entirely from these intrinsic sentiment rewards. This framework is similar to intrinsic motivation, where the environment does not necessarily provide the rewards, but the agent analyzes and realizes them by itself. We find that providing dense rewards in text-based games using sentiment analysis improves performance under some conditions.

* Work in progress 
Access Paper or Ask Questions

Indonesian Social Media Sentiment Analysis With Sarcasm Detection

May 12, 2015
Edwin Lunando, Ayu Purwarianti

Sarcasm is considered one of the most difficult problem in sentiment analysis. In our ob-servation on Indonesian social media, for cer-tain topics, people tend to criticize something using sarcasm. Here, we proposed two additional features to detect sarcasm after a common sentiment analysis is conducted. The features are the negativity information and the number of interjection words. We also employed translated SentiWordNet in the sentiment classification. All the classifications were conducted with machine learning algorithms. The experimental results showed that the additional features are quite effective in the sarcasm detection.

* 4 pages; 3 figures 
Access Paper or Ask Questions

Sentiment Polarity Detection for Software Development

Sep 25, 2017
Fabio Calefato, Filippo Lanubile, Federico Maiorano, Nicole Novielli

The role of sentiment analysis is increasingly emerging to study software developers' emotions by mining crowd-generated content within social software engineering tools. However, off-the-shelf sentiment analysis tools have been trained on non-technical domains and general-purpose social media, thus resulting in misclassifications of technical jargon and problem reports. Here, we present Senti4SD, a classifier specifically trained to support sentiment analysis in developers' communication channels. Senti4SD is trained and validated using a gold standard of Stack Overflow questions, answers, and comments manually annotated for sentiment polarity. It exploits a suite of both lexicon- and keyword-based features, as well as semantic features based on word embedding. With respect to a mainstream off-the-shelf tool, which we use as a baseline, Senti4SD reduces the misclassifications of neutral and positive posts as emotionally negative. To encourage replications, we release a lab package including the classifier, the word embedding space, and the gold standard with annotation guidelines.

* Empirical Software Engineering, June 2018, Volume 23, Issue 3, pp 1352 - 1382 
* Cite as: Calefato, F., Lanubile, F., Maiorano, F., Novielli N. Empir Software Eng (2017). Full-text view-only version here:, Empir Software Eng (2017) 
Access Paper or Ask Questions

Performance Investigation of Feature Selection Methods

Sep 16, 2013
Anuj sharma, Shubhamoy Dey

Sentiment analysis or opinion mining has become an open research domain after proliferation of Internet and Web 2.0 social media. People express their attitudes and opinions on social media including blogs, discussion forums, tweets, etc. and, sentiment analysis concerns about detecting and extracting sentiment or opinion from online text. Sentiment based text classification is different from topical text classification since it involves discrimination based on expressed opinion on a topic. Feature selection is significant for sentiment analysis as the opinionated text may have high dimensions, which can adversely affect the performance of sentiment analysis classifier. This paper explores applicability of feature selection methods for sentiment analysis and investigates their performance for classification in term of recall, precision and accuracy. Five feature selection methods (Document Frequency, Information Gain, Gain Ratio, Chi Squared, and Relief-F) and three popular sentiment feature lexicons (HM, GI and Opinion Lexicon) are investigated on movie reviews corpus with a size of 2000 documents. The experimental results show that Information Gain gave consistent results and Gain Ratio performs overall best for sentimental feature selection while sentiment lexicons gave poor performance. Furthermore, we found that performance of the classifier depends on appropriate number of representative feature selected from text.

* 6 pages 
Access Paper or Ask Questions

A Statistical Parsing Framework for Sentiment Classification

Mar 05, 2015
Li Dong, Furu Wei, Shujie Liu, Ming Zhou, Ke Xu

We present a statistical parsing framework for sentence-level sentiment classification in this article. Unlike previous works that employ syntactic parsing results for sentiment analysis, we develop a statistical parser to directly analyze the sentiment structure of a sentence. We show that complicated phenomena in sentiment analysis (e.g., negation, intensification, and contrast) can be handled the same as simple and straightforward sentiment expressions in a unified and probabilistic way. We formulate the sentiment grammar upon Context-Free Grammars (CFGs), and provide a formal description of the sentiment parsing framework. We develop the parsing model to obtain possible sentiment parse trees for a sentence, from which the polarity model is proposed to derive the sentiment strength and polarity, and the ranking model is dedicated to selecting the best sentiment tree. We train the parser directly from examples of sentences annotated only with sentiment polarity labels but without any syntactic annotations or polarity annotations of constituents within sentences. Therefore we can obtain training data easily. In particular, we train a sentiment parser, s.parser, from a large amount of review sentences with users' ratings as rough sentiment polarity labels. Extensive experiments on existing benchmark datasets show significant improvements over baseline sentiment classification approaches.

* Accepted by Computational Linguistics 
Access Paper or Ask Questions

Article citation study: Context enhanced citation sentiment detection

May 10, 2020
Vishal Vyas, Kumar Ravi, Vadlamani Ravi, V. Uma, Srirangaraj Setlur, Venu Govindaraju

Citation sentimet analysis is one of the little studied tasks for scientometric analysis. For citation analysis, we developed eight datasets comprising citation sentences, which are manually annotated by us into three sentiment polarities viz. positive, negative, and neutral. Among eight datasets, three were developed by considering the whole context of citations. Furthermore, we proposed an ensembled feature engineering method comprising word embeddings obtained for texts, parts-of-speech tags, and dependency relationships together. Ensembled features were considered as input to deep learning based approaches for citation sentiment classification, which is in turn compared with Bag-of-Words approach. Experimental results demonstrate that deep learning is useful for higher number of samples, whereas support vector machine is the winner for smaller number of samples. Moreover, context-based samples are proved to be more effective than context-less samples for citation sentiment analysis.

* 39 pages, 12 Tables, 5 Figures, Journal Paper 
Access Paper or Ask Questions

Better Document-level Sentiment Analysis from RST Discourse Parsing

Sep 11, 2015
Parminder Bhatia, Yangfeng Ji, Jacob Eisenstein

Discourse structure is the hidden link between surface features and document-level properties, such as sentiment polarity. We show that the discourse analyses produced by Rhetorical Structure Theory (RST) parsers can improve document-level sentiment analysis, via composition of local information up the discourse tree. First, we show that reweighting discourse units according to their position in a dependency representation of the rhetorical structure can yield substantial improvements on lexicon-based sentiment analysis. Next, we present a recursive neural network over the RST structure, which offers significant improvements over classification-based methods.

* Published at Empirical Methods in Natural Language Processing (EMNLP 2015) 
Access Paper or Ask Questions