Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Sentiment Analysis": models, code, and papers

A novel approach to sentiment analysis in Persian using discourse and external semantic information

Jul 18, 2020
Rahim Dehkharghani, Hojjat Emami

Sentiment analysis attempts to identify, extract and quantify affective states and subjective information from various types of data such as text, audio, and video. Many approaches have been proposed to extract the sentiment of individuals from documents written in natural languages in recent years. The majority of these approaches have focused on English, while resource-lean languages such as Persian suffer from the lack of research work and language resources. Due to this gap in Persian, the current work is accomplished to introduce new methods for sentiment analysis which have been applied on Persian. The proposed approach in this paper is two-fold: The first one is based on classifier combination, and the second one is based on deep neural networks which benefits from word embedding vectors. Both approaches takes advantage of local discourse information and external knowledge bases, and also cover several language issues such as negation and intensification, andaddresses different granularity levels, namely word, aspect, sentence, phrase and document-levels. To evaluate the performance of the proposed approach, a Persian dataset is collected from Persian hotel reviews referred as hotel reviews. The proposed approach has been compared to counterpart methods based on the benchmark dataset. The experimental results approve the effectiveness of the proposed approach when compared to related works.

* 15 pages 
Access Paper or Ask Questions

A Convolutional Neural Network for Aspect Sentiment Classification

Jul 04, 2018
Yongping Xing, Chuangbai Xiao, Yifei Wu, Ziming Ding

With the development of the Internet, natural language processing (NLP), in which sentiment analysis is an important task, became vital in information processing.Sentiment analysis includes aspect sentiment classification. Aspect sentiment can provide complete and in-depth results with increased attention on aspect-level. Different context words in a sentence influence the sentiment polarity of a sentence variably, and polarity varies based on the different aspects in a sentence. Take the sentence, 'I bought a new camera. The picture quality is amazing but the battery life is too short.'as an example. If the aspect is picture quality, then the expected sentiment polarity is 'positive', if the battery life aspect is considered, then the sentiment polarity should be 'negative'; therefore, aspect is important to consider when we explore aspect sentiment in the sentence. Recurrent neural network (RNN) is regarded as a good model to deal with natural language processing, and RNNs has get good performance on aspect sentiment classification including Target-Dependent LSTM (TD-LSTM) ,Target-Connection LSTM (TC-LSTM) (Tang, 2015a, b), AE-LSTM, AT-LSTM, AEAT-LSTM (Wang et al., 2016).There are also extensive literatures on sentiment classification utilizing convolutional neural network, but there is little literature on aspect sentiment classification using convolutional neural network. In our paper, we develop attention-based input layers in which aspect information is considered by input layer. We then incorporate attention-based input layers into convolutional neural network (CNN) to introduce context words information. In our experiment, incorporating aspect information into CNN improves the latter's aspect sentiment classification performance without using syntactic parser or external sentiment lexicons in a benchmark dataset from Twitter but get better performance compared with other models.

Access Paper or Ask Questions

IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text using Deep Neural Networks and Linear Baselines

Jul 24, 2020
Soroush Javdan, Taha Shangipour ataei, Behrouz Minaei-Bidgoli

Sentiment Analysis is a well-studied field of Natural Language Processing. However, the rapid growth of social media and noisy content within them poses significant challenges in addressing this problem with well-established methods and tools. One of these challenges is code-mixing, which means using different languages to convey thoughts in social media texts. Our group, with the name of IUST(username: TAHA), participated at the SemEval-2020 shared task 9 on Sentiment Analysis for Code-Mixed Social Media Text, and we have attempted to develop a system to predict the sentiment of a given code-mixed tweet. We used different preprocessing techniques and proposed to use different methods that vary from NBSVM to more complicated deep neural network models. Our best performing method obtains an F1 score of 0.751 for the Spanish-English sub-task and 0.706 over the Hindi-English sub-task.

Access Paper or Ask Questions

MultiBooked: A Corpus of Basque and Catalan Hotel Reviews Annotated for Aspect-level Sentiment Classification

Mar 22, 2018
Jeremy Barnes, Patrik Lambert, Toni Badia

While sentiment analysis has become an established field in the NLP community, research into languages other than English has been hindered by the lack of resources. Although much research in multi-lingual and cross-lingual sentiment analysis has focused on unsupervised or semi-supervised approaches, these still require a large number of resources and do not reach the performance of supervised approaches. With this in mind, we introduce two datasets for supervised aspect-level sentiment analysis in Basque and Catalan, both of which are under-resourced languages. We provide high-quality annotations and benchmarks with the hope that they will be useful to the growing community of researchers working on these languages.

* Accepted at LREC 2018 
Access Paper or Ask Questions

Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis

Jan 13, 2022
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Hua Jin, Dacheng Tao

Aspect-based sentiment analysis (ABSA) is a fine-grained task of sentiment analysis. To better comprehend long complicated sentences and obtain accurate aspect-specific information, linguistic and commonsense knowledge are generally required in this task. However, most methods employ complicated and inefficient approaches to incorporate external knowledge, e.g., directly searching the graph nodes. Additionally, the complementarity between external knowledge and linguistic information has not been thoroughly studied. To this end, we propose a knowledge graph augmented network (KGAN), which aims to effectively incorporate external knowledge with explicitly syntactic and contextual information. In particular, KGAN captures the sentiment feature representations from multiple different perspectives, i.e., context-, syntax- and knowledge-based. First, KGAN learns the contextual and syntactic representations in parallel to fully extract the semantic features. Then, KGAN integrates the knowledge graphs into the embedding space, based on which the aspect-specific knowledge representations are further obtained via an attention mechanism. Last, we propose a hierarchical fusion module to complement these multiview representations in a local-to-global manner. Extensive experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN. Notably, with the help of the pretrained model of RoBERTa, KGAN achieves a new record of state-of-the-art performance.

* Under review 
Access Paper or Ask Questions

Politics and Virality in the Time of Twitter: A Large-Scale Cross-Party Sentiment Analysis in Greece, Spain and United Kingdom

Feb 01, 2022
Dimosthenis Antypas, Alun Preece, Jose Camacho Collados

Social media has become extremely influential when it comes to policy making in modern societies especially in the western world (e.g., 48% of Europeans use social media every day or almost every day). Platforms such as Twitter allow users to follow politicians, thus making citizens more involved in political discussion. In the same vein, politicians use Twitter to express their opinions, debate among others on current topics and promote their political agenda aiming to influence voter behaviour. Previous studies have shown that tweets conveying negative sentiment are likely to be retweeted more frequently. In this paper, we attempt to analyse tweets from politicians from different countries and explore if their tweets follow the same trend. Utilising state-of-the-art pre-trained language models we performed sentiment analysis on multilingual tweets collected from members of parliament of Greece, Spain and United Kingdom, including devolved administrations. We achieved this by systematically exploring and analysing the differences between influential and less popular tweets. Our analysis indicates that politicians' negatively charged tweets spread more widely, especially in more recent times, and highlights interesting trends in the intersection of sentiment and popularity.

* 12 pages, 5 figures, for code and data used see 
Access Paper or Ask Questions

MuSe 2020 -- The First International Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop

Apr 30, 2020
Lukas Stappen, Alice Baird, Georgios Rizos, Panagiotis Tzirakis, Xinchen Du, Felix Hafner, Lea Schumann, Adria Mallol-Ragolta, Björn W. Schuller, Iulia Lefter, Erik Cambria, Ioannis Kompatsiaris

Multimodal Sentiment Analysis in Real-life Media (MuSe) 2020 is a Challenge-based Workshop focusing on the tasks of sentiment recognition, as well as emotion-target engagement and trustworthiness detection by means of more comprehensively integrating the audio-visual and language modalities. The purpose of MuSe 2020 is to bring together communities from different disciplines; mainly, the audio-visual emotion recognition community (signal-based), and the sentiment analysis community (symbol-based). We present three distinct sub-challenges: MuSe-Wild, which focuses on continuous emotion (arousal and valence) prediction; MuSe-Topic, in which participants recognise domain-specific topics as the target of 3-class (low, medium, high) emotions; and MuSe-Trust, in which the novel aspect of trustworthiness is to be predicted. In this paper, we provide detailed information on MuSe-CaR, the first of its kind in-the-wild database, which is utilised for the challenge, as well as the state-of-the-art features and modelling approaches applied. For each sub-challenge, a competitive baseline for participants is set; namely, on test we report for MuSe-Wild a combined (valence and arousal) CCC of .2568, for MuSe-Topic a score (computed as 0.34$\cdot$ UAR + 0.66$\cdot$F1) of 76.78 % on the 10-class topic and 40.64 % on the 3-class emotion prediction, and for MuSe-Trust a CCC of .4359.

* Baseline Paper MuSe 2020, MuSe Workshop Challenge, ACM Multimedia 
Access Paper or Ask Questions

Classification Benchmarks for Under-resourced Bengali Language based on Multichannel Convolutional-LSTM Network

Apr 19, 2020
Md. Rezaul Karim, Bharathi Raja Chakravarthi, John P. McCrae, Michael Cochez

Exponential growths of social media and micro-blogging sites not only provide platforms for empowering freedom of expressions and individual voices but also enables people to express anti-social behaviour like online harassment, cyberbullying, and hate speech. Numerous works have been proposed to utilize these data for social and anti-social behaviours analysis, document characterization, and sentiment analysis by predicting the contexts mostly for highly resourced languages such as English. However, there are languages that are under-resources, e.g., South Asian languages like Bengali, Tamil, Assamese, Telugu that lack of computational resources for the NLP tasks. In this paper, we provide several classification benchmarks for Bengali, an under-resourced language. We prepared three datasets of expressing hate, commonly used topics, and opinions for hate speech detection, document classification, and sentiment analysis, respectively. We built the largest Bengali word embedding models to date based on 250 million articles, which we call BengFastText. We perform three different experiments, covering document classification, sentiment analysis, and hate speech detection. We incorporate word embeddings into a Multichannel Convolutional-LSTM (MConv-LSTM) network for predicting different types of hate speech, document classification, and sentiment analysis. Experiments demonstrate that BengFastText can capture the semantics of words from respective contexts correctly. Evaluations against several baseline embedding models, e.g., Word2Vec and GloVe yield up to 92.30%, 82.25%, and 90.45% F1-scores in case of document classification, sentiment analysis, and hate speech detection, respectively during 5-fold cross-validation tests.

* This paper is under review in the Journal of Natural Language Engineering 
Access Paper or Ask Questions

What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI

Feb 29, 2020
Chaehan So

When people buy products online, they primarily base their decisions on the recommendations of others given in online reviews. The current work analyzed these online reviews by sentiment analysis and used the extracted sentiments as features to predict the product ratings by several machine learning algorithms. These predictions were disentangled by various meth-ods of explainable AI (XAI) to understand whether the model showed any bias during prediction. Study 1 benchmarked these algorithms (knn, support vector machines, random forests, gradient boosting machines, XGBoost) and identified random forests and XGBoost as best algorithms for predicting the product ratings. In Study 2, the analysis of global feature importance identified the sentiment joy and the emotional valence negative as most predictive features. Two XAI visualization methods, local feature attributions and partial dependency plots, revealed several incorrect prediction mechanisms on the instance-level. Performing the benchmarking as classification, Study 3 identified a high no-information rate of 64.4% that indicated high class imbalance as underlying reason for the identified problems. In conclusion, good performance by machine learning algorithms must be taken with caution because the dataset, as encountered in this work, could be biased towards certain predictions. This work demonstrates how XAI methods reveal such prediction bias.

* To be published in: Lecture Notes in Artificial Intelligence, 1st International Conference on Artificial Intelligence in HCI, AI-HCI, Held as Part of HCI International 2020, Kopenhagen, Denmark, July 19-24, Springer 
Access Paper or Ask Questions