"Sentiment Analysis": models, code, and papers

KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis

May 02, 2020
Deepanway Ghosal, Devamanyu Hazarika, Navonil Majumder, Abhinaba Roy, Soujanya Poria, Rada Mihalcea

Cross-domain sentiment analysis has received significant attention in recent years, prompted by the need to combat the domain gap between different applications that make use of sentiment analysis. In this paper, we take a novel perspective on this task by exploring the role of external commonsense knowledge. We introduce a new framework, KinGDOM, which utilizes the ConceptNet knowledge graph to enrich the semantics of a document by providing both domain-specific and domain-general background concepts. These concepts are learned by training a graph convolutional autoencoder that leverages inter-domain concepts in a domain-invariant manner. Conditioning a popular domain-adversarial baseline method with these learned concepts helps improve its performance over state-of-the-art approaches, demonstrating the efficacy of our proposed framework.

Machine Learning based English Sentiment Analysis

May 16, 2019
T. N. T. Tran, L. K. N. Nguyen, V. M. Ngo

Sentiment analysis, or opinion mining, aims to determine customers' attitudes, judgments, and opinions about a product or service. Such a system helps manufacturers and service providers gauge customer satisfaction with their products or services and make appropriate adjustments. We combine a popular machine learning method, the Support Vector Machine, with the Waikato Environment for Knowledge Analysis (WEKA) library to build a Java web program that analyzes the sentiment of English comments on four types of women's products: dresses, handbags, shoes, and rings. We developed and tested our system with a training set of 300 comments and a test set of 400 comments. The experimental precision, recall, and F-measure are 89.3%, 95.0%, and 92.1% for positive comments; 97.1%, 78.5%, and 86.8% for negative comments; and 76.7%, 86.2%, and 81.2% for neutral comments.

* Journal of Science and Technology, Vietnam Academy of Science and Technology, Vol. 52, No. 4D, pp. 142-155 (2014) 
* 6 pages, in Vietnamese 
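
The F-measures reported in this abstract can be checked from the stated precision and recall values. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used, and the helper name `f_measure` is purely illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract reports F1; it does not name the variant.
    Inputs and output are percentages.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract, in percent.
reported = {
    "positive": (89.3, 95.0),
    "negative": (97.1, 78.5),
    "neutral": (76.7, 86.2),
}

for label, (p, r) in reported.items():
    print(f"{label}: F1 = {f_measure(p, r):.1f}%")
```

Under that assumption, the recomputed values round to the reported 92.1%, 86.8%, and 81.2%.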

Sentiment Analysis of Arabic Tweets: Feature Engineering and A Hybrid Approach

May 22, 2018
Nora Al-Twairesh, Hend Al-Khalifa, AbdulMalik Alsalman, Yousef Al-Ohali

Sentiment analysis in Arabic is a challenging task due to the rich morphology of the language. The task is further complicated when applied to Twitter data, which is known to be highly informal and noisy. In this paper, we develop a hybrid method for sentiment analysis of Arabic tweets in a specific Arabic dialect, the Saudi dialect. Several features were engineered and evaluated using a backward feature selection method. Then a hybrid method that combines corpus-based and lexicon-based methods was developed for several classification models (two-way, three-way, and four-way). The best F1-scores for these models were 69.9, 61.63, and 55.07, respectively.

Cross-Lingual Sentiment Analysis Without (Good) Translation

Oct 24, 2017
Mohamed Abdalla, Graeme Hirst

Current approaches to cross-lingual sentiment analysis try to leverage the wealth of labeled English data using bilingual lexicons, bilingual vector space embeddings, or machine translation systems. Here we show that it is possible to use a single linear transformation, with as few as 2000 word pairs, to capture fine-grained sentiment relationships between words in a cross-lingual setting. We apply these cross-lingual sentiment models to a diverse set of tasks to demonstrate their functionality in a non-English context. By effectively leveraging English sentiment knowledge without the need for accurate translation, we can analyze and extract features from other languages with scarce data at a very low cost, thus making sentiment and related analyses for many languages inexpensive.

* 10 pages, 4 figures 
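
The idea of a single linear transformation learned from a few thousand word pairs can be illustrated with an ordinary least-squares fit. This is only a sketch on synthetic data, not the authors' procedure: the embedding dimension, the noise level, the mapping direction, and the helper name `fit_linear_map` are all assumptions, and real use would substitute pretrained embeddings for the paired words.

```python
import numpy as np


def fit_linear_map(src_vecs, tgt_vecs):
    """Return W minimizing ||src_vecs @ W - tgt_vecs||_F (least squares)."""
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W


# Synthetic stand-in for 2000 translation pairs (assumed sizes).
rng = np.random.default_rng(0)
n_pairs, dim = 2000, 50
true_W = rng.normal(size=(dim, dim))      # hidden map we try to recover
src = rng.normal(size=(n_pairs, dim))     # "source-language" word vectors
tgt = src @ true_W + 0.01 * rng.normal(size=(n_pairs, dim))  # noisy targets

W = fit_linear_map(src, tgt)

# Project source-language vectors into the target space, where a
# sentiment model trained on the target language could be applied.
mapped = src[:5] @ W
print(W.shape, mapped.shape)
```

With far more pairs than dimensions, the system is overdetermined, which is why a plain least-squares fit is enough for this toy setting.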

SemEval-2013 Task 2: Sentiment Analysis in Twitter

Dec 14, 2019
Preslav Nakov, Zornitsa Kozareva, Alan Ritter, Sara Rosenthal, Veselin Stoyanov, Theresa Wilson

In recent years, sentiment analysis in social media has attracted a lot of research interest and has been used for a number of applications. Unfortunately, research has been hindered by the lack of suitable datasets, complicating the comparison between approaches. To address this issue, we have proposed SemEval-2013 Task 2: Sentiment Analysis in Twitter, which included two subtasks: A, an expression-level subtask, and B, a message-level subtask. We used crowdsourcing on Amazon Mechanical Turk to label a large Twitter training dataset along with additional test sets of Twitter and SMS messages for both subtasks. All datasets used in the evaluation are released to the research community. The task attracted significant interest and a total of 149 submissions from 44 teams. The best-performing team achieved an F1 of 88.9% and 69% for subtasks A and B, respectively.

* SemEval-2013 
* Sentiment analysis, microblog sentiment analysis, Twitter opinion mining, SMS 

Overcoming Language Variation in Sentiment Analysis with Social Attention

Aug 26, 2017
Yi Yang, Jacob Eisenstein

Variation in language is ubiquitous, particularly in newer forms of writing such as social media. Fortunately, variation is not random; it is often linked to social properties of the author. In this paper, we show how to exploit social networks to make sentiment analysis more robust to social language variation. The key idea is linguistic homophily: the tendency of socially linked individuals to use language in similar ways. We formalize this idea in a novel attention-based neural network architecture, in which attention is divided among several basis models, depending on the author's position in the social network. This has the effect of smoothing the classification function across the social network, and makes it possible to induce personalized classifiers even for authors for whom there is no labeled data or demographic metadata. This model significantly improves the accuracies of sentiment analysis on Twitter and on review data.
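
The phrase "attention is divided among several basis models" can be pictured as a softmax-weighted mixture of classifiers, with the weights computed from a vector representing the author. The sketch below is a toy illustration, not the paper's architecture: the author vectors, attention keys, logistic basis models, and every name in it are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_predict(author_vec, text_vec, attention_keys, basis_weights):
    """P(positive) as an attention-weighted mix of K basis classifiers.

    Assumption: each basis model is a logistic classifier over text
    features, and attention depends only on the author's vector.
    """
    attn = softmax([dot(author_vec, key) for key in attention_keys])
    probs = [1.0 / (1.0 + math.exp(-dot(text_vec, w))) for w in basis_weights]
    return sum(a * p for a, p in zip(attn, probs)), attn


# Two basis models that read the same text feature in opposite ways.
attention_keys = [[1.0, 0.0], [0.0, 1.0]]
basis_weights = [[2.0, 0.0], [-2.0, 0.0]]
text_vec = [1.0, 0.5]

# Hypothetical authors from two different regions of the social network.
p_a, attn_a = mixture_predict([3.0, 0.0], text_vec, attention_keys, basis_weights)
p_b, attn_b = mixture_predict([0.0, 3.0], text_vec, attention_keys, basis_weights)
print(p_a, p_b)
```

The same text receives a different prediction for each author because their vectors select different basis models. Authors with similar vectors would get similar attention weights, which is one way to read the smoothing effect described in the abstract.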

* Published in Transactions of the Association for Computational Linguistics (TACL), 2017. Please cite the TACL version: 

Method for Aspect-Based Sentiment Annotation Using Rhetorical Analysis

Sep 13, 2017
Łukasz Augustyniak, Krzysztof Rajda, Tomasz Kajdanowicz

This paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. We present a comprehensive set of techniques derived from Rhetorical Structure Theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive summary of a set of opinions. Moreover, we propose aspect-aspect graphs to evaluate the importance of aspects and to filter out unimportant ones from the summary. Additionally, the paper presents a prototype solution of data flow with interesting and valuable results. The proposed method's results proved the high accuracy of aspect detection when applied to the gold standard dataset.

SentiQ: A Probabilistic Logic Approach to Enhance Sentiment Analysis Tool Quality

Aug 19, 2020
Wissam Maamar Kouadri, Salima Benbernou, Mourad Ouziri, Themis Palpanas, Iheb Ben Amor

The opinions expressed on various websites and social media are an essential contributor to the decision-making process of several organizations. Existing sentiment analysis tools aim to extract the polarity (i.e., positive, negative, neutral) from this opinionated content. Despite advances in research in the field, sentiment analysis tools give inconsistent polarities, which is harmful to business decisions. In this paper, we propose SentiQ, an unsupervised Markov Logic Network-based approach that injects the semantic dimension into the tools through rules. It detects and resolves inconsistencies and thereby improves the overall accuracy of the tools. Preliminary experimental results demonstrate the usefulness of SentiQ.

* In Proceedings of the 9th KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM 20). San Diego, CA, USA, 8 pages 

L3CubeMahaSent: A Marathi Tweet-based Sentiment Analysis Dataset

Mar 21, 2021
Atharva Kulkarni, Meet Mandhane, Manali Likhitkar, Gayatri Kshirsagar, Raviraj Joshi

Sentiment analysis is one of the most fundamental tasks in Natural Language Processing. Popular languages like English, Arabic, Russian, and Mandarin, as well as Indian languages such as Hindi, Bengali, and Tamil, have seen a significant amount of work in this area. However, the Marathi language, which is the third most popular language in India, still lags behind due to the absence of proper datasets. In this paper, we present the first major publicly available Marathi sentiment analysis dataset, L3CubeMahaSent. It is curated using tweets extracted from the Twitter accounts of various Maharashtrian personalities. Our dataset consists of ~16,000 distinct tweets classified into three broad classes: positive, negative, and neutral. We also present the guidelines we used to annotate the tweets. Finally, we present the statistics of our dataset and baseline classification results using CNN, LSTM, ULMFiT, and BERT-based deep learning models.

* Accepted at [email protected] 2021 

FinBERT: Financial Sentiment Analysis with Pre-trained Language Models

Aug 27, 2019
Dogu Araci

Financial sentiment analysis is a challenging task due to the specialized language and lack of labeled data in that domain. General-purpose models are not effective enough because of the specialized language used in a financial context. We hypothesize that pre-trained language models can help with this problem because they require fewer labeled examples and they can be further trained on domain-specific corpora. We introduce FinBERT, a language model based on BERT, to tackle NLP tasks in the financial domain. Our results show improvement in every measured metric on current state-of-the-art results for two financial sentiment analysis datasets. We find that even with a smaller training set and fine-tuning only a part of the model, FinBERT outperforms state-of-the-art machine learning methods.

* This thesis is submitted in partial fulfillment for the degree of Master of Science in Information Studies: Data Science, University of Amsterdam. June 25, 2019 