"Sentiment Analysis": models, code, and papers

Weighed Domain-Invariant Representation Learning for Cross-domain Sentiment Analysis

Sep 18, 2019
Minlong Peng, Qi Zhang, Xuanjing Huang

Cross-domain sentiment analysis is currently a hot topic in the research and engineering areas. One of the most popular frameworks in this field is the domain-invariant representation learning (DIRL) paradigm, which aims to learn a distribution-invariant feature representation across domains. However, in this work, we find that applying DIRL may harm domain adaptation when the label distribution $\mathrm{P}(\mathrm{Y})$ changes across domains. To address this problem, we propose a modification to DIRL, obtaining a novel weighted domain-invariant representation learning (WDIRL) framework. We show that it is easy to transfer existing SOTA DIRL models to WDIRL. Empirical studies on extensive cross-domain sentiment analysis tasks verify our claims and show the effectiveness of our proposed solution.

* Address the problem of the domain-invariant representation learning framework under target shift 
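
The target-shift issue described in this abstract can be illustrated with a small importance-weighting sketch. This is a generic illustration, not the authors' WDIRL implementation: the function names are hypothetical, and the target label distribution is assumed to be known here, whereas in practice it would have to be estimated. Each source example gets the weight P_T(y) / P_S(y), so that the reweighted source label distribution matches the target one before any feature alignment is applied.

```python
from collections import Counter


def class_weights(source_labels, target_label_dist):
    """Return w[y] = P_T(y) / P_S(y) for every label seen in the source data."""
    counts = Counter(source_labels)
    n = len(source_labels)
    return {y: target_label_dist[y] / (counts[y] / n) for y in counts}


def weighted_label_dist(source_labels, weights):
    """Label distribution of the source data after per-class reweighting."""
    totals = Counter()
    for y in source_labels:
        totals[y] += weights[y]
    z = sum(totals.values())
    return {y: t / z for y, t in totals.items()}


# Toy data: the source domain is 80% positive, the target is balanced.
source = ["pos"] * 8 + ["neg"] * 2
target_dist = {"pos": 0.5, "neg": 0.5}  # assumption: known target P(Y)

weights = class_weights(source, target_dist)
reweighted = weighted_label_dist(source, weights)
```

After reweighting, the source label distribution equals the assumed target distribution, which is the condition under which aligning feature distributions across domains is less likely to mix up classes.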

[email protected]: Sentiment Analysis of Code-Mixed Dravidian text using XLNet

Oct 15, 2020
Shubhanker Banerjee, Arun Jayapal, Sajeetha Thavareesan

Social media has penetrated multilingual societies; however, most of them use English as the preferred language for communication. It is therefore natural for them to mix their cultural language with English during conversations, resulting in an abundance of multilingual data, called code-mixed data, in today's world. Downstream NLP tasks using such data are challenging because the semantics are spread across multiple languages. One such natural language processing task is sentiment analysis; for this, we use an auto-regressive XLNet model to perform sentiment analysis on code-mixed Tamil-English and Malayalam-English datasets.

* 7 pages 

Improving Sentiment Analysis in Arabic Using Word Representation

Mar 30, 2018
Abdulaziz M. Alayba, Vasile Palade, Matthew England, Rahat Iqbal

The complexities of the Arabic language in morphology, orthography and dialects make sentiment analysis for Arabic more challenging. Also, text feature extraction from short messages like tweets, in order to gauge the sentiment, makes this task even more difficult. In recent years, deep neural networks have often been employed and have shown very good results in sentiment classification and natural language processing applications. Word embedding, or the word distributing approach, is a current and powerful tool to capture together the closest words from a contextual text. In this paper, we describe how we construct Word2Vec models from a large Arabic corpus obtained from ten newspapers in different Arab countries. By applying different machine learning algorithms and convolutional neural networks with different text feature selections, we report improved accuracy of sentiment classification (91%-95%) on our publicly available Arabic language health sentiment dataset [1].

* Proc. 2nd International Workshop on Arabic and Derived Script Analysis and Recognition (ASAR '18), pp. 13-18. IEEE, 2018 
* Authors' accepted version of submission for ASAR 2018 

LT3 at SemEval-2020 Task 9: Cross-lingual Embeddings for Sentiment Analysis of Hinglish Social Media Text

Oct 21, 2020
Pranaydeep Singh, Els Lefever

This paper describes our contribution to the SemEval-2020 Task 9 on Sentiment Analysis for Code-mixed Social Media Text. We investigated two approaches to solve the task of Hinglish sentiment analysis. The first approach uses cross-lingual embeddings resulting from projecting Hinglish and pre-trained English FastText word embeddings in the same space. The second approach incorporates pre-trained English embeddings that are incrementally retrained with a set of Hinglish tweets. The results show that the second approach performs best, with an F1-score of 70.52% on the held-out test data.


Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning

Mar 05, 2021
Jinsong Su, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, Jiebo Luo

In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction. However, such a mechanism suffers from one drawback: only a few frequent words with sentiment polarities tend to be taken into consideration for the final sentiment decision, while abundant infrequent sentiment words are ignored by models. To deal with this issue, we propose a progressive self-supervised attention learning approach for attentional ABSA models. In this approach, we iteratively perform sentiment prediction on all training instances, and continually learn useful attention supervision information in the meantime. During training, at each iteration, context words with the highest impact on sentiment prediction, identified based on their attention weights or gradients, are extracted as words with active/misleading influence on the correct/incorrect prediction for each instance. Words extracted in this way are masked for subsequent iterations. To exploit these extracted words for refining ABSA models, we augment the conventional training objective with a regularization term that encourages ABSA models to not only take full advantage of the extracted active context words but also decrease the weights of those misleading words. We integrate the proposed approach into three state-of-the-art neural ABSA models. Experimental results and in-depth analyses show that our approach yields better attention results and significantly enhances the performance of all three models. We release the source code and trained models at

* Artificial Intelligence 2021 
* 31 pages. arXiv admin note: text overlap with arXiv:1906.01213 

Sentiment Analysis Using Averaged Weighted Word Vector Features

Feb 13, 2020
Ali Erkan, Tunga Gungor

People use the World Wide Web heavily to share their experience with entities such as products, services, or travel destinations. Texts that provide online feedback in the form of reviews and comments are essential for making consumer decisions. These comments create a valuable source that may be used to measure satisfaction related to products or services. Sentiment analysis is the task of identifying opinions expressed in such text fragments. In this work, we develop two methods that combine different types of word vectors to learn and estimate the polarity of reviews. We develop average review vectors from word vectors and add weights to these review vectors using word frequencies in positive and negative sensitivity-tagged reviews. We applied the methods to several datasets from different domains that are used as standard benchmarks for sentiment analysis. We ensemble the techniques with each other and with existing methods, and we make a comparison with the approaches in the literature. The results show that our approaches outperform the state-of-the-art success rates.
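
The idea of a frequency-weighted average review vector can be sketched as follows. The abstract does not give the weighting formula, so the one used here, 1 + |pos - neg| / (pos + neg), is an assumption chosen only for illustration, as are the function names and the toy two-dimensional vectors. Words that occur mostly in one polarity class receive a larger weight in the average.

```python
from collections import Counter


def polarity_weights(pos_reviews, neg_reviews):
    """Weight each word by how unevenly it is spread over the two classes."""
    pos = Counter(w for r in pos_reviews for w in r.split())
    neg = Counter(w for r in neg_reviews for w in r.split())
    vocab = set(pos) | set(neg)
    # Assumed scheme: 1.0 for evenly spread words, up to 2.0 for one-sided words.
    return {w: 1.0 + abs(pos[w] - neg[w]) / (pos[w] + neg[w]) for w in vocab}


def review_vector(review, vectors, weights):
    """Weighted average of the word vectors of the known words in a review."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in review.split():
        if word in vectors:
            w = weights.get(word, 1.0)
            total = [t + w * x for t, x in zip(total, vectors[word])]
            weight_sum += w
    return [t / weight_sum for t in total] if weight_sum else total


# Toy word vectors and tagged reviews (illustrative only).
vectors = {"good": [1.0, 0.0], "the": [0.0, 1.0]}
weights = polarity_weights(["the good", "good"], ["the bad"])
vec = review_vector("the good", vectors, weights)
```

Here "good" appears only in positive reviews and gets weight 2.0, while "the" is evenly spread and keeps weight 1.0, so the review vector leans towards "good". The resulting vectors could then be fed to any standard classifier.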


COVID-19 sentiment analysis via deep learning during the rise of novel cases

Apr 05, 2021
Rohitash Chandra, Aswin Krishna

Social scientists and psychologists take an interest in understanding how people express emotions or sentiments when dealing with catastrophic events such as natural disasters, political unrest, and terrorism. The COVID-19 pandemic is a catastrophic event that has raised a number of psychological issues such as depression, given abrupt social changes and lack of employment. During the rise of COVID-19 cases with stricter lockdowns, people have been expressing their sentiments on social media, which can provide a deep understanding of how people physiologically react to catastrophic events. In this paper, we use deep learning based language models via long short-term memory (LSTM) recurrent neural networks for sentiment analysis on Twitter, with a focus on the rise of novel cases in India. We use the LSTM model with global vectors (GloVe) for word representation in building a language model. We review the sentiments expressed for selected months covering the major peak of new cases in 2020. We present a framework that focuses on multi-label sentiment classification using an LSTM model and GloVe embeddings, where more than one sentiment can be expressed at once. Our results show that the majority of the tweets have been positive, with high levels of optimism during the rise of the COVID-19 cases in India. We find that the number of tweets fell significantly towards the peak of new cases. We find that optimistic and joking tweets mostly dominated the monthly tweets and that a much lower number of negative sentiments was expressed. This could imply that the majority were generally positive and that some were annoyed at the way the pandemic was handled by the authorities as the peak was reached.


BERT based sentiment analysis: A software engineering perspective

Jun 28, 2021
Himanshu Batra, Narinder Singh Punn, Sanjay Kumar Sonbhadra, Sonali Agarwal

Sentiment analysis can provide a suitable lead for the tools used in software engineering, along with the API recommendation systems and relevant libraries to be used. In this context, existing tools like SentiCR, SentiStrength-SE, etc. exhibit low F1-scores that defeat the purpose of deploying such strategies, so there is ample scope for performance improvement. Recent advancements show that transformer-based pre-trained models (e.g., BERT, RoBERTa, ALBERT, etc.) have displayed better results in the text classification task. Following this context, the present research explores different BERT-based models to analyze the sentences in GitHub comments, Jira comments, and Stack Overflow posts. The paper presents three different strategies to analyse BERT-based models for sentiment analysis: in the first strategy, the BERT-based pre-trained models are fine-tuned; in the second strategy, an ensemble model is developed from BERT variants; and in the third strategy, a compressed model (DistilBERT) is used. The experimental results show that the BERT-based ensemble approach and the compressed BERT model attain improvements of 6-12% over prevailing tools for the F1 measure on all three datasets.
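
One common way to build an ensemble from several classifiers is soft voting, sketched below. The abstract does not say how the BERT variants are combined, so averaging class probabilities is an assumption, and the probability values are made up for illustration rather than taken from real models.

```python
def soft_vote(prob_rows):
    """Average per-class probabilities over models and pick the top class.

    prob_rows holds one probability list per model, all over the same classes.
    Returns (index of the winning class, averaged probabilities).
    """
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    avg = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


labels = ["negative", "neutral", "positive"]
# Hypothetical outputs of three fine-tuned models for one comment.
model_probs = [
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
best, avg = soft_vote(model_probs)
predicted = labels[best]
```

In this toy case the first model prefers "neutral", but the averaged probabilities favour "positive", which shows how an ensemble can override a single model's decision.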


PhonSenticNet: A Cognitive Approach to Microtext Normalization for Concept-Level Sentiment Analysis

Apr 24, 2019
Ranjan Satapathy, Aalind Singh, Erik Cambria

With the current upsurge in the usage of social media platforms, the trend of using short text (microtext) in place of standard words has seen a significant rise. The usage of microtext poses a considerable performance issue in concept-level sentiment analysis, since models are trained on standard words. This paper discusses the impact of coupling sub-symbolic (phonetics) with symbolic (machine learning) Artificial Intelligence to transform out-of-vocabulary concepts into their standard in-vocabulary form. The phonetic distance is calculated using the Sorensen similarity algorithm. The phonetically similar in-vocabulary concepts thus obtained are then used to compute the correct polarity value, which was previously miscalculated because of the presence of microtext. Our proposed framework increases the accuracy of polarity detection by 6% compared to the earlier model. This also validates the fact that microtext normalization is a necessary prerequisite for the sentiment analysis task.

* This paper is submitted to INTERSPEECH2019
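
The Sorensen similarity mentioned in the abstract above can be sketched as the Sørensen-Dice coefficient over character bigrams. This is a minimal illustration under stated assumptions: the paper compares phonetic encodings, a step omitted here, so the functions below work on raw spellings, and the names `sorensen_dice` and `normalize` as well as the toy vocabulary are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Character bigrams of a string, e.g. 'abc' -> ['ab', 'bc']."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def sorensen_dice(a, b):
    """Sørensen-Dice coefficient: 2 * |A ∩ B| / (|A| + |B|) on bigram multisets."""
    ba, bb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        # Both strings are too short to have bigrams; fall back to equality.
        return 1.0 if a == b else 0.0
    overlap = sum((ba & bb).values())
    return 2 * overlap / total


def normalize(token, vocabulary):
    """Map a microtext token to the most similar in-vocabulary word."""
    return max(vocabulary, key=lambda word: sorensen_dice(token, word))


vocabulary = ["tomorrow", "today", "tonight"]
best = normalize("tomoro", vocabulary)
score = sorensen_dice("night", "nacht")
```

With this toy vocabulary, "tomoro" shares five bigrams with "tomorrow" and only one with each of the other words, so it is mapped to "tomorrow"; the polarity lookup would then use the normalized word.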