Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Sentiment Analysis": models, code, and papers

Country Image in COVID-19 Pandemic: A Case Study of China

Sep 12, 2020
Huimin Chen, Zeyu Zhu, Fanchao Qi, Yining Ye, Zhiyuan Liu, Maosong Sun, Jianbin Jin

Country image has a profound influence on international relations and economic development. In the worldwide outbreak of COVID-19, countries and their people display different reactions, resulting in diverse perceived images among foreign public. Therefore, in this study, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Twitter dataset. To our knowledge, this is the first study to explore country image in such a fine-grained way. To perform the analysis, we first build a manually-labeled Twitter dataset with aspect-level sentiment annotations. Afterward, we conduct the aspect-based sentiment analysis with BERT to explore the image of China. We discover an overall sentiment change from non-negative to negative in the general public, and explain it with the increasing mentions of negative ideology-related aspects and decreasing mentions of non-negative fact-based aspects. Further investigations into different groups of Twitter users, including U.S. Congress members, English media, and social bots, reveal different patterns in their attitudes toward China. This study provides a deeper understanding of the changing image of China in COVID-19 pandemic. Our research also demonstrates how aspect-based sentiment analysis can be applied in social science researches to deliver valuable insights.

Access Paper or Ask Questions

Various Approaches to Aspect-based Sentiment Analysis

May 05, 2018
Amlaan Bhoi, Sandeep Joshi

The problem of aspect-based sentiment analysis deals with classifying sentiments (negative, neutral, positive) for a given aspect in a sentence. A traditional sentiment classification task involves treating the entire sentence as a text document and classifying sentiments based on all the words. Let us assume, we have a sentence such as "the acceleration of this car is fast, but the reliability is horrible". This can be a difficult sentence because it has two aspects with conflicting sentiments about the same entity. Considering machine learning techniques (or deep learning), how do we encode the information that we are interested in one aspect and its sentiment but not the other? Let us explore various pre-processing steps, features, and methods used to facilitate in solving this task.

* 3 pages, 1 table 
Access Paper or Ask Questions

Deep Learning for Sentiment Analysis : A Survey

Jan 30, 2018
Lei Zhang, Shuai Wang, Bing Liu

Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with the success of deep learning in many other application domains, deep learning is also popularly used in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.

* 34 pages, 9 figures, 2 tables 
Access Paper or Ask Questions

LEBANONUPRISING: a thorough study of Lebanese tweets

Sep 30, 2020
Reda Khalaf, Mireille Makary

Recent studies showed a huge interest in social networks sentiment analysis. Twitter, which is a microblogging service, can be a great source of information on how the users feel about a certain topic, or what their opinion is regarding a social, economic and even political matter. On October 17, Lebanon witnessed the start of a revolution; the LebanonUprising hashtag became viral on Twitter. A dataset consisting of a 100,0000 tweets was collected between 18 and 21 October. In this paper, we conducted a sentiment analysis study for the tweets in spoken Lebanese Arabic related to the LebanonUprising hashtag using different machine learning algorithms. The dataset was manually annotated to measure the precision and recall metrics and to compare between the different algorithms. Furthermore, the work completed in this paper provides two more contributions. The first is related to building a Lebanese to Modern Standard Arabic mapping dictionary that was used for the preprocessing of the tweets and the second is an attempt to move from sentiment analysis to emotion detection using emojis, and the two emotions we tried to predict were the "sarcastic" and "funny" emotions. We built a training set from the tweets collected in October 2019 and then we used this set to predict sentiments and emotions of the tweets we collected between May and August 2020. The analysis we conducted shows the variation in sentiments, emotions and users between the two datasets. The results we obtained seem satisfactory especially considering that there was no previous or similar work done involving Lebanese Arabic tweets, to our knowledge.

* 9 pages, published at the CMLA 2020 conference 
Access Paper or Ask Questions

Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation

Aug 05, 2021
Zaid Khan, Yun Fu

Multimodal target/aspect sentiment classification combines multimodal sentiment analysis and aspect/target sentiment classification. The goal of the task is to combine vision and language to understand the sentiment towards a target entity in a sentence. Twitter is an ideal setting for the task because it is inherently multimodal, highly emotional, and affects real world events. However, multimodal tweets are short and accompanied by complex, possibly irrelevant images. We introduce a two-stream model that translates images in input space using an object-aware transformer followed by a single-pass non-autoregressive text generation approach. We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model. Our approach increases the amount of text available to the language model and distills the object-level information in complex images. We achieve state-of-the-art performance on two multimodal Twitter datasets without modifying the internals of the language model to accept multimodal data, demonstrating the effectiveness of our translation. In addition, we explain a failure mode of a popular approach for aspect sentiment analysis when applied to tweets. Our code is available at \textcolor{blue}{\url{}}.

* ACM Multimedia 2021 Oral 
Access Paper or Ask Questions

ArSentD-LEV: A Multi-Topic Corpus for Target-based Sentiment Analysis in Arabic Levantine Tweets

May 25, 2019
Ramy Baly, Alaa Khaddaj, Hazem Hajj, Wassim El-Hajj, Khaled Bashir Shaban

Sentiment analysis is a highly subjective and challenging task. Its complexity further increases when applied to the Arabic language, mainly because of the large variety of dialects that are unstandardized and widely used in the Web, especially in social media. While many datasets have been released to train sentiment classifiers in Arabic, most of these datasets contain shallow annotation, only marking the sentiment of the text unit, as a word, a sentence or a document. In this paper, we present the Arabic Sentiment Twitter Dataset for the Levantine dialect (ArSenTD-LEV). Based on findings from analyzing tweets from the Levant region, we created a dataset of 4,000 tweets with the following annotations: the overall sentiment of the tweet, the target to which the sentiment was expressed, how the sentiment was expressed, and the topic of the tweet. Results confirm the importance of these annotations at improving the performance of a baseline sentiment classifier. They also confirm the gap of training in a certain domain, and testing in another domain.

* Corpus development, Levantine tweets, multi-topic, sentiment analysis, sentiment target, LREC-2018, OSACT-2018 
Access Paper or Ask Questions

Analysis of opinionated text for opinion mining

Jul 14, 2016
K Paramesha, K C Ravishankar

In sentiment analysis, the polarities of the opinions expressed on an object/feature are determined to assess the sentiment of a sentence or document whether it is positive/negative/neutral. Naturally, the object/feature is a noun representation which refers to a product or a component of a product, let us say, the "lens" in a camera and opinions emanating on it are captured in adjectives, verbs, adverbs and noun words themselves. Apart from such words, other meta-information and diverse effective features are also going to play an important role in influencing the sentiment polarity and contribute significantly to the performance of the system. In this paper, some of the associated information/meta-data are explored and investigated in the sentiment text. Based on the analysis results presented here, there is scope for further assessment and utilization of the meta-information as features in text categorization, ranking text document, identification of spam documents and polarity classification problems.

* Machine Learning and Applications: An International Journal (MLAIJ) Vol.3, No.2, June 2016 
* Sentiment Analysis, Features, Feature Engineering, Emotions, Word Sense Disambiguation, Sentiment Lexicons, Meta-Information 
Access Paper or Ask Questions

NoReC: The Norwegian Review Corpus

Oct 15, 2017
Erik Velldal, Lilja Øvrelid, Eivind Alexander Bergem, Cathrine Stadsnes, Samia Touileb, Fredrik Jørgensen

This paper presents the Norwegian Review Corpus (NoReC), created for training and evaluating models for document-level sentiment analysis. The full-text reviews have been collected from major Norwegian news sources and cover a range of different domains, including literature, movies, video games, restaurants, music and theater, in addition to product reviews across a range of categories. Each review is labeled with a manually assigned score of 1-6, as provided by the rating of the original author. This first release of the corpus comprises more than 35,000 reviews. It is distributed using the CoNLL-U format, pre-processed using UDPipe, along with a rich set of metadata. The work reported in this paper forms part of the SANT initiative (Sentiment Analysis for Norwegian Text), a project seeking to provide resources and tools for sentiment analysis and opinion mining for Norwegian. As resources for sentiment analysis have so far been unavailable for Norwegian, NoReC represents a highly valuable and sought-after addition to Norwegian language technology.

* Pending (non-anonymous) review for LREC 2018 
Access Paper or Ask Questions

Bambara Language Dataset for Sentiment Analysis

Aug 05, 2021
Mountaga Diallo, Chayma Fourati, Hatem Haddad

For easier communication, posting, or commenting on each others posts, people use their dialects. In Africa, various languages and dialects exist. However, they are still underrepresented and not fully exploited for analytical studies and research purposes. In order to perform approaches like Machine Learning and Deep Learning, datasets are required. One of the African languages is Bambara, used by citizens in different countries. However, no previous work on datasets for this language was performed for Sentiment Analysis. In this paper, we present the first common-crawl-based Bambara dialectal dataset dedicated for Sentiment Analysis, available freely for Natural Language Processing research purposes.

* 2nd Workshop on Practical ML for Developing Countries: Learning Under Limited/low Resource Scenarios, International Conference on Learning Representations, 2021 
Access Paper or Ask Questions

Sentiment Analysis of Cybersecurity Content on Twitter and Reddit

Apr 26, 2022
Bipun Thapa

Sentiment Analysis provides an opportunity to understand the subject(s), especially in the digital age, due to an abundance of public data and effective algorithms. Cybersecurity is a subject where opinions are plentiful and differing in the public domain. This descriptive research analyzed cybersecurity content on Twitter and Reddit to measure its sentiment, positive or negative, or neutral. The data from Twitter and Reddit was amassed via technology-specific APIs during a selected timeframe to create datasets, which were then analyzed individually for their sentiment by VADER, an NLP (Natural Language Processing) algorithm. A random sample of cybersecurity content (ten tweets and posts) was also classified for sentiments by twenty human annotators to evaluate the performance of VADER. Cybersecurity content on Twitter was at least 48% positive, and Reddit was at least 26.5% positive. The positive or neutral content far outweighed negative sentiments across both platforms. When compared to human classification, which was considered the standard or source of truth, VADER produced 60% accuracy for Twitter and 70% for Reddit in assessing the sentiment; in other words, some agreement between algorithm and human classifiers. Overall, the goal was to explore an uninhibited research topic about cybersecurity sentiment

* Content is peer-reviewed and was presented in 3rd International Conference on NLP & Information Retrieval (NLPI 2022) 
Access Paper or Ask Questions