Sentiment analysis is a task of natural language processing which has recently attracted increasing attention. However, sentiment analysis research has mainly been carried out for the English language. Although Arabic is ramping up as one of the most used languages on the Internet, only a few studies have focused on Arabic sentiment analysis so far. In this paper, we carry out an in-depth qualitative study of the most important research works in this context by presenting limits and strengths of existing approaches. In particular, we survey both approaches that leverage machine translation or transfer learning to adapt English resources to Arabic and approaches that stem directly from the Arabic language.
One of the main difficulties in sentiment analysis of the Arabic language is the presence of colloquialisms. In this paper, we examine the effect of using objective words in conjunction with sentimental words on sentiment classification of colloquial Arabic reviews, specifically Jordanian colloquial reviews. Reviews often include both sentimental and objective words; however, most existing sentiment analysis models ignore the objective words because they are considered useless. In this work, we created two lexicons: the first includes colloquial sentimental words and compound phrases, while the second contains objective words associated with sentiment tendency values based on a particular estimation method. We used these lexicons to extract sentiment features that serve as training input to a Support Vector Machine (SVM), which classifies the sentiment polarity of the reviews. The review dataset was collected manually from the JEERAN website. The experimental results show that the proposed approach improves polarity classification in comparison to two baseline models, reaching an accuracy of 95.6%.
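As a rough illustration of the lexicon-based feature extraction described above, the following minimal sketch turns one review into a small feature vector. It is not the authors' implementation: the lexicon entries are hypothetical English placeholders (the Jordanian colloquial lexicons are not reproduced here), the three-feature layout and the function name `extract_features` are assumptions, and the SVM training step is omitted.

```python
# Hypothetical toy lexicons; the real ones hold Jordanian colloquial terms.
SENTIMENT_LEXICON = {"great": 1.0, "tasty": 1.0, "bad": -1.0, "not good": -1.0}
# Objective words with an assumed sentiment-tendency value.
OBJECTIVE_LEXICON = {"service": 0.2, "price": -0.1, "waiter": 0.05}


def extract_features(review):
    """Return [positive score, negative score, objective tendency] for one review."""
    tokens = review.lower().split()
    pos = neg = obj = 0.0
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if len(tokens) - i >= 2 and bigram in SENTIMENT_LEXICON:
            # A compound phrase matches: consume both tokens at once.
            score, step = SENTIMENT_LEXICON[bigram], 2
        else:
            score, step = SENTIMENT_LEXICON.get(tokens[i], 0.0), 1
            obj += OBJECTIVE_LEXICON.get(tokens[i], 0.0)
        pos += max(score, 0.0)
        neg += min(score, 0.0)
        i += step
    return [pos, neg, obj]


features = extract_features("The service was not good")
```

Vectors of this kind, one per review, would then be passed to a classifier such as an SVM together with the polarity labels.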
Today's business ecosystem has become very competitive, and customer satisfaction has become a major focus for business growth. Business organizations spend a lot of money and human resources on various strategies to understand and fulfill their customers' needs. However, because of flawed manual analysis of customers' multifarious needs, many organizations fail to achieve customer satisfaction. As a result, they lose customer loyalty and spend extra money on marketing. These problems can be addressed with Sentiment Analysis, a technique that combines Natural Language Processing (NLP) and Machine Learning (ML). Sentiment Analysis is broadly used to extract insights from wider public opinion on certain topics, products, and services, and it can be applied to any data available online. In this paper, we use two NLP techniques (Bag-of-Words and TF-IDF) and various ML classification algorithms (Support Vector Machine, Logistic Regression, Multinomial Naive Bayes, Random Forest) to find an effective approach for Sentiment Analysis on a large, imbalanced, multi-class dataset. Our best approaches achieve 77% accuracy using Support Vector Machine and Logistic Regression with the Bag-of-Words technique.
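To make the two text representations named above concrete, here is a minimal sketch of Bag-of-Words counts and TF-IDF weights using only the Python standard library. The whitespace tokenizer, the function names, and the unsmoothed form idf = log(N / df) are assumptions chosen for brevity; library implementations typically apply smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Reweight raw counts by idf = log(N / df), the unsmoothed textbook form."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weights = [
        [row[j] * math.log(n_docs / df[j]) for j in range(len(vocab))]
        for row in counts
    ]
    return vocab, weights


vocab, counts = bag_of_words(["good food", "bad food"])
_, weights = tf_idf(["good food", "bad food"])
```

In this toy corpus the word "food" occurs in every document, so its TF-IDF weight is zero, whereas Bag-of-Words keeps its raw count. Either matrix can be fed to the classifiers listed in the abstract.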
Sentiment analysis has become a very important tool for the analysis of social media data. Several methods have been developed in this research field, many of them working very differently from each other, covering distinct aspects of the problem and following disparate strategies. Despite the large number of existing techniques, no single one fits well in all cases or for all data sources. Supervised approaches may be able to adapt to specific situations, but they require manually labeled training data, which is cumbersome and expensive to acquire, especially for a new application. In this context, we propose to combine several popular and effective state-of-the-practice sentiment analysis methods by means of an unsupervised bootstrapped strategy for polarity classification. One of our main goals is to reduce the large variability (lack of stability) of the unsupervised methods across different domains (datasets). Our solution was thoroughly tested on thirteen different datasets in several domains such as opinions, comments, and social media. The experimental results demonstrate that our combined method, 10SENT, improves the effectiveness of the classification task and, more importantly, addresses a key problem in the field: it is consistently among the best methods across many data types, meaning that it produces the best (or close to the best) results in almost all considered contexts without any additional costs (e.g., manual labeling). Our self-learning approach is also largely independent of the base methods, which makes it easy to extend with any new method that may be proposed in the future. Finally, we investigate a transfer learning approach for sentiment analysis as a means of gathering additional (unsupervised) information for the proposed approach, and we show the potential of this technique to improve our results.
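The idea of combining several off-the-shelf methods without manual labels can be sketched as a majority vote that produces pseudo-labels for later self-training. This is only an illustration under stated assumptions, not the 10SENT algorithm itself: the three base methods are hypothetical keyword rules, and the agreement threshold `min_agreement` is an invented parameter.

```python
from collections import Counter


def majority_vote(predictions, min_agreement=0.6):
    """Return the most voted label, or None if agreement is below the threshold."""
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes / len(predictions) >= min_agreement else None


def bootstrap_labels(texts, base_methods, min_agreement=0.6):
    """Keep only texts whose voted label is confident enough to act as a pseudo-label."""
    labeled = []
    for text in texts:
        votes = [method(text) for method in base_methods]
        label = majority_vote(votes, min_agreement)
        if label is not None:
            labeled.append((text, label))
    return labeled


# Hypothetical stand-ins for existing unsupervised sentiment methods.
def method_a(text):
    return "pos" if "good" in text else "neg"


def method_b(text):
    return "pos" if "love" in text or "good" in text else "neg"


def method_c(text):
    return "neg" if "bad" in text else "pos"


methods = [method_a, method_b, method_c]
pseudo_labeled = bootstrap_labels(["good movie", "bad plot", "plain text"], methods)
```

The resulting pairs could serve as training data for a supervised classifier, which is the general shape of a bootstrapped, self-learning setup. Raising `min_agreement` trades coverage for cleaner pseudo-labels.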
Attention scorers have achieved success in parsing tasks such as semantic and syntactic dependency parsing. However, in tasks modeled as parsing, such as structured sentiment analysis, the "dependency edges" are very sparse, which hinders parser performance. We therefore propose a sparse and fuzzy attention scorer with pooling layers, which improves parser performance and sets a new state of the art on structured sentiment analysis. We further explore parsing-based modeling of structured sentiment analysis with second-order parsing and introduce a novel sparse second-order edge building procedure that leads to a significant improvement in parsing performance.
Aspect-based sentiment analysis (ABSA) performs a fine-grained analysis that identifies the aspects of a given document or sentence and the sentiments conveyed regarding each aspect. This is the most detailed level of analysis and is capable of exploring the nuanced viewpoints expressed in reviews. Most of the available ABSA research focuses on English, with very little work on Arabic. Most previous work on Arabic has relied on conventional machine learning methods that depend mainly on a set of scarce resources and tools for analyzing and processing Arabic content, such as lexicons, and the lack of those resources presents another challenge. To overcome these obstacles, we propose Deep Learning (DL)-based methods using two models based on Gated Recurrent Unit (GRU) neural networks for ABSA. The first is a DL model that takes advantage of both word and character representations by combining a bidirectional GRU, a Convolutional Neural Network (CNN), and a Conditional Random Field (CRF), forming the BGRU-CNN-CRF model, to extract the main opinionated aspects (OTE). The second is an interactive attention network based on a bidirectional GRU (IAN-BGRU) that identifies the sentiment polarity toward the extracted aspects. We evaluated our models on the benchmark Arabic hotel reviews dataset. The results indicate that the proposed methods outperform the baseline research on both tasks, with a 38.5% improvement in F1-score for opinion target extraction (T2) and a 7.5% improvement in accuracy for aspect-based sentiment polarity classification (T3), obtaining an F1-score of 69.44% for T2 and an accuracy of 83.98% for T3.
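For readers unfamiliar with how an F1-score is obtained for opinion target extraction, the following minimal sketch computes it over sets of extracted spans. The exact-match criterion, the `(start, end)` span encoding, and the function name `span_f1` are assumptions; the abstract does not state the evaluation protocol, so this may differ from the one used to obtain the reported figures.

```python
def span_f1(predicted, gold):
    """Exact-match F1 between two collections of (start, end) aspect spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Also covers empty inputs, where precision or recall is undefined.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# One of two predicted spans matches one of two gold spans.
score = span_f1([(0, 2), (5, 6)], [(0, 2), (7, 9)])
```

Here precision and recall are both 0.5, so the F1-score is also 0.5.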
This paper proposes a new HDP-based online review rating regression model named Topic-Sentiment-Preference Regression Analysis (TSPRA). TSPRA combines topics (i.e., product aspects), word sentiment, and user preference as regression factors, and is able to perform topic clustering, review rating prediction, sentiment analysis, and what we introduce as "critical aspect" analysis together in one framework. TSPRA extends sentiment approaches by taking into consideration the key concept of "user preference" from collaborative filtering (CF) models, while it differs from current CF models by decoupling "user preference" and "sentiment" as independent factors. Our experiments on 22 Amazon datasets show overwhelmingly better performance in rating prediction than the state-of-the-art model FLAME (2015) in terms of error, Pearson's correlation, and number of inverted pairs. For sentiment analysis, we compare the derived word sentiments against the public sentiment resource SenticNet3, and our sentiment estimates clearly make more sense in the context of online reviews. Finally, as a result of the de-correlation of "user preference" from "sentiment", TSPRA is able to evaluate a new concept, "critical aspects", defined as the product aspects that users care strongly about but comment on negatively in reviews. Improving such "critical aspects" could be the most effective way to enhance user experience.
Decision making models are constrained to take expert evaluations expressed in pre-defined numerical or linguistic terms. We claim that the use of sentiment analysis will allow decision making models to consider expert evaluations in natural language. Accordingly, we propose the Sentiment Analysis based Multi-person Multi-criteria Decision Making (SA-MpMcDM) methodology, which builds the expert evaluations from their natural language reviews, and also from their numerical ratings if they are available. The SA-MpMcDM methodology incorporates an end-to-end multi-task deep learning model for aspect-based sentiment analysis, named the DMuABSA model, which is able to identify the aspect categories mentioned in an expert review and to distill the experts' opinions and criteria. The individual expert evaluations are aggregated via a criteria weighting based on the attention of the experts. We evaluate the methodology on a restaurant decision problem, for which we build the TripR-2020 dataset of restaurant reviews, which we manually annotate and release. We analyze the SA-MpMcDM methodology in different scenarios, with and without natural language and numerical evaluations. The analysis shows that combining both sources of information results in a higher quality preference vector.
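The aggregation step can be pictured with a small sketch that weights criteria and combines expert scores into a preference vector. This is an illustration under explicit assumptions rather than the SA-MpMcDM procedure: "attention of the experts" is approximated here by how often each criterion is mentioned, expert scores are assumed to lie in [0, 1], and all names and numbers are hypothetical.

```python
def criteria_weights(mention_counts):
    """Normalize per-criterion mention counts so that the weights sum to 1."""
    total = sum(mention_counts.values())
    return {criterion: count / total for criterion, count in mention_counts.items()}


def preference_vector(evaluations, weights):
    """Score each alternative as the weighted sum of mean expert scores per criterion.

    evaluations[alternative][criterion] is a list of expert scores in [0, 1].
    """
    scores = {}
    for alternative, by_criterion in evaluations.items():
        scores[alternative] = sum(
            weights[criterion] * sum(ratings) / len(ratings)
            for criterion, ratings in by_criterion.items()
        )
    return scores


# Hypothetical data: "food" is mentioned three times as often as "service".
weights = criteria_weights({"food": 3, "service": 1})
evaluations = {
    "restaurant_a": {"food": [1.0, 0.5], "service": [0.0]},
    "restaurant_b": {"food": [0.5], "service": [1.0]},
}
preferences = preference_vector(evaluations, weights)
```

Ranking the alternatives by their value in `preferences` gives the collective decision. With these toy numbers, `restaurant_b` ranks above `restaurant_a`.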