Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Sentiment Analysis": models, code, and papers

Phonetic-enriched Text Representation for Chinese Sentiment Analysis with Reinforcement Learning

Jan 23, 2019
Haiyun Peng, Yukun Ma, Soujanya Poria, Yang Li, Erik Cambria

The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. We are the first to argue that these two important properties can play a major role in Chinese sentiment analysis. Particularly, we propose two effective features to encode phonetic information. Next, we develop a Disambiguate Intonation for Sentiment Analysis (DISA) network using a reinforcement network. It functions as disambiguating intonations for each Chinese character (pinyin). Thus, a precise phonetic representation of Chinese is learned. Furthermore, we also fuse phonetic features with textual and visual features in order to mimic the way humans read and understand Chinese text. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and outshines the state-of-the-art Chinese character level representations.


Structured Self-Attention Weights Encode Semantics in Sentiment Analysis

Oct 10, 2020
Zhengxuan Wu, Thanh-Son Nguyen, Desmond C. Ong

Neural attention, especially the self-attention made popular by the Transformer, has become the workhorse of state-of-the-art natural language processing (NLP) models. Very recent work suggests that the self-attention in the Transformer encodes syntactic information; Here, we show that self-attention scores encode semantics by considering sentiment analysis tasks. In contrast to gradient-based feature attribution methods, we propose a simple and effective Layer-wise Attention Tracing (LAT) method to analyze structured attention weights. We apply our method to Transformer models trained on two tasks that have surface dissimilarities, but share common semantics---sentiment analysis of movie reviews and time-series valence prediction in life story narratives. Across both tasks, words with high aggregated attention weights were rich in emotional semantics, as quantitatively validated by an emotion lexicon labeled by human annotators. Our results show that structured attention weights encode rich semantics in sentiment analysis, and match human interpretations of semantics.

* 10 pages 

LABR: A Large Scale Arabic Sentiment Analysis Benchmark

May 03, 2015
Mahmoud Nabil, Mohamed Aly, Amir Atiya

We introduce LABR, the largest sentiment analysis dataset to-date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the dataset, and present its statistics. We explore using the dataset for two tasks: (1) sentiment polarity classification; and (2) ratings classification. Moreover, we provide standard splits of the dataset into training, validation and testing, for both polarity and ratings classification, in both balanced and unbalanced settings. We extend our previous work by performing a comprehensive analysis on the dataset. In particular, we perform an extended survey of the different classifiers typically used for the sentiment polarity classification problem. We also construct a sentiment lexicon from the dataset that contains both single and compound sentiment words and we explore its effectiveness. We make the dataset and experimental details publicly available.

* 10 pages 

YouTube Ad View Sentiment Analysis using Deep Learning and Machine Learning

May 23, 2022
Tanvi Mehta, Ganesh Deshmukh

Sentiment Analysis is currently a vital area of research. With the advancement in the use of the internet, the creation of social media, websites, blogs, opinions, ratings, etc. has increased rapidly. People express their feedback and emotions on social media posts in the form of likes, dislikes, comments, etc. The rapid growth in the volume of viewer-generated or user-generated data or content on YouTube has led to an increase in YouTube sentiment analysis. Due to this, analyzing the public reactions has become an essential need for information extraction and data visualization in the technical domain. This research predicts YouTube Ad view sentiments using Deep Learning and Machine Learning algorithms like Linear Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Artificial Neural Network (ANN). Finally, a comparative analysis is done based on experimental results acquired from different models.

* International Journal of Computer Applications 184(11):10-14, May 2022 
* 5 pages, 9 figures, Published with International Journal of Computer Applications (IJCA) 

Exploiting Effective Representations for Chinese Sentiment Analysis Using a Multi-Channel Convolutional Neural Network

Aug 08, 2018
Pengfei Liu, Ji Zhang, Cane Wing-Ki Leung, Chao He, Thomas L. Griffiths

Effective representation of a text is critical for various natural language processing tasks. For the particular task of Chinese sentiment analysis, it is important to understand and choose an effective representation of a text from different forms of Chinese representations such as word, character and pinyin. This paper presents a systematic study of the effect of these representations for Chinese sentiment analysis by proposing a multi-channel convolutional neural network (MCCNN), where each channel corresponds to a representation. Experimental results show that: (1) Word wins on the dataset of low OOV rate while character wins otherwise; (2) Using these representations in combination generally improves the performance; (3) The representations based on MCCNN outperform conventional ngram features using SVM; (4) The proposed MCCNN model achieves the competitive performance against the state-of-the-art model fastText for Chinese sentiment analysis.


SlangSD: Building and Using a Sentiment Dictionary of Slang Words for Short-Text Sentiment Classification

Aug 17, 2016
Liang Wu, Fred Morstatter, Huan Liu

Sentiment in social media is increasingly considered as an important resource for customer segmentation, market understanding, and tackling other socio-economic issues. However, sentiment in social media is difficult to measure since user-generated content is usually short and informal. Although many traditional sentiment analysis methods have been proposed, identifying slang sentiment words remains untackled. One of the reasons is that slang sentiment words are not available in existing dictionaries or sentiment lexicons. To this end, we propose to build the first sentiment dictionary of slang words to aid sentiment analysis of social media content. It is laborious and time-consuming to collect and label the sentiment polarity of a comprehensive list of slang words. We present an approach to leverage web resources to construct an extensive Slang Sentiment word Dictionary (SlangSD) that is easy to maintain and extend. SlangSD is publicly available for research purposes. We empirically show the advantages of using SlangSD, the newly-built slang sentiment word dictionary for sentiment classification, and provide examples demonstrating its ease of use with an existing sentiment system.

* 15 pages, 2 figures 

Gated Mechanism for Attention Based Multimodal Sentiment Analysis

Feb 21, 2020
Ayush Kumar, Jithendra Vepa

Multimodal sentiment analysis has recently gained popularity because of its relevance to social media posts, customer service calls and video blogs. In this paper, we address three aspects of multimodal sentiment analysis; 1. Cross modal interaction learning, i.e. how multiple modalities contribute to the sentiment, 2. Learning long-term dependencies in multimodal interactions and 3. Fusion of unimodal and cross modal cues. Out of these three, we find that learning cross modal interactions is beneficial for this problem. We perform experiments on two benchmark datasets, CMU Multimodal Opinion level Sentiment Intensity (CMU-MOSI) and CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) corpus. Our approach on both these tasks yields accuracies of 83.9% and 81.1% respectively, which is 1.6% and 1.34% absolute improvement over current state-of-the-art.

* Accepted to appear in ICASSP 2020 

BiSyn-GAT+: Bi-Syntax Aware Graph Attention Network for Aspect-based Sentiment Analysis

Apr 06, 2022
Shuo Liang, Wei Wei, Xian-Ling Mao, Fei Wang, Zhiyong He

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to align aspects and corresponding sentiments for aspect-specific sentiment polarity inference. It is challenging because a sentence may contain multiple aspects or complicated (e.g., conditional, coordinating, or adversative) relations. Recently, exploiting dependency syntax information with graph neural networks has been the most popular trend. Despite its success, methods that heavily rely on the dependency tree pose challenges in accurately modeling the alignment of the aspects and their words indicative of sentiment, since the dependency tree may provide noisy signals of unrelated associations (e.g., the "conj" relation between "great" and "dreadful" in Figure 2). In this paper, to alleviate this problem, we propose a Bi-Syntax aware Graph Attention Network (BiSyn-GAT+). Specifically, BiSyn-GAT+ fully exploits the syntax information (e.g., phrase segmentation and hierarchical structure) of the constituent tree of a sentence to model the sentiment-aware context of every single aspect (called intra-context) and the sentiment relations across aspects (called inter-context) for learning. Experiments on four benchmark datasets demonstrate that BiSyn-GAT+ outperforms the state-of-the-art methods consistently.


Identifying negativity factors from social media text corpus using sentiment analysis method

Jul 06, 2021
Mohammad Aimal, Maheen Bakhtyar, Junaid Baber, Sadia Lakho, Umar Mohammad, Warda Ahmed, Jahanvash Karim

Automatic sentiment analysis play vital role in decision making. Many organizations spend a lot of budget to understand their customer satisfaction by manually going over their feedback/comments or tweets. Automatic sentiment analysis can give overall picture of the comments received against any event, product, or activity. Usually, the comments/tweets are classified into two main classes that are negative or positive. However, the negative comments are too abstract to understand the basic reason or the context. organizations are interested to identify the exact reason for the negativity. In this research study, we hierarchically goes down into negative comments, and link them with more classes. Tweets are extracted from social media sites such as Twitter and Facebook. If the sentiment analysis classifies any tweet into negative class, then we further try to associates that negative comments with more possible negative classes. Based on expert opinions, the negative comments/tweets are further classified into 8 classes. Different machine learning algorithms are evaluated and their accuracy are reported.

* SOFA 2020 
* Paper is accepted in the SOFA 2020 conference 

Empirical Evaluation of Leveraging Named Entities for Arabic Sentiment Analysis

Apr 23, 2019
Hala Mulki, Hatem Haddad, Mourad Gridach, Ismail Babaoglu

Social media reflects the public attitudes towards specific events. Events are often related to persons, locations or organizations, the so-called Named Entities. This can define Named Entities as sentiment-bearing components. In this paper, we dive beyond Named Entities recognition to the exploitation of sentiment-annotated Named Entities in Arabic sentiment analysis. Therefore, we develop an algorithm to detect the sentiment of Named Entities based on the majority of attitudes towards them. This enabled tagging Named Entities with proper tags and, thus, including them in a sentiment analysis framework of two models: supervised and lexicon-based. Both models were applied on datasets of multi-dialectal content. The results revealed that Named Entities have no considerable impact on the supervised model, while employing them in the lexicon-based model improved the classification performance and outperformed most of the baseline systems.

* 7 pages, 5 figures, 7 tables