We perform spatio-temporal analysis of public sentiment using geotagged photo collections. We develop a deep learning-based classifier that predicts the emotion conveyed by an image. This allows us to associate sentiment with place. We perform spatial hotspot detection and show that different emotions have distinct spatial distributions that match expectations. We also perform temporal analysis using the capture time of the photos. Our spatio-temporal hotspot detection correctly identifies emerging concentrations of specific emotions and year-by-year analyses of select locations show there are strong temporal correlations between the predicted emotions and known events.
This paper presents a method for sentiment classification based on a robust ensemble of models. We preprocess the tweet data to improve tokenizer coverage. To reduce domain bias, we first further train a pre-trained language model on the tweet dataset. Because each classifier has its own strengths and weaknesses, we combine different types of models with two ensemble methods: averaging and a power weighted sum. Our experiments show that this approach improves sentiment classification. Our system placed third among 26 teams in the evaluation of the SocialNLP 2020 EmotionGIF competition.
The large amount of data available in social media, forums and websites motivates research in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area, due to its subjective and semantic characteristics, motivates research on novel methods and approaches for classification. Hence, there is a high demand for datasets in different domains and different languages. This paper introduces TweetSentBR, a manually annotated sentiment corpus for Brazilian Portuguese with 15,000 sentences in the TV show domain. The sentences were labeled in three classes (positive, neutral and negative) by seven annotators, following guidelines from the literature to ensure the reliability of the annotation. We also ran baseline experiments on polarity classification using three machine learning methods, reaching 80.99% F-measure and 82.06% accuracy in binary classification, and 59.85% F-measure and 64.62% accuracy in three-point classification.
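The two ensemble methods named above can be sketched as follows. The abstract does not define the power weighted sum, so this sketch assumes one plausible reading: each model's class probabilities are weighted by its validation score raised to a power, with the weights normalised to sum to one. The function names, the `val_scores` argument and the `power` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_ensemble(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted mean of the per-class probabilities across models."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def power_weighted_sum(
    model_probs: Sequence[Sequence[float]],
    val_scores: Sequence[float],
    power: float = 2.0,
) -> List[float]:
    """Weighted sum of per-class probabilities.

    ASSUMPTION: the weight of each model is (validation score ** power),
    normalised so that the weights sum to 1. The paper may define it differently.
    """
    raw = [s ** power for s in val_scores]
    total = sum(raw)
    weights = [r / total for r in raw]
    n_classes = len(model_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, model_probs))
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Two hypothetical models, two classes.
    probs = [[0.6, 0.4], [0.2, 0.8]]
    print(average_ensemble(probs))
    print(power_weighted_sum(probs, val_scores=[0.70, 0.80], power=2.0))
```

Under this assumption, a larger `power` shifts more weight toward the models that score best on validation data, while equal scores reduce the weighted sum to the plain average.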
Cross-domain sentiment analysis has received significant attention in recent years, prompted by the need to combat the domain gap between different applications that make use of sentiment analysis. In this paper, we take a novel perspective on this task by exploring the role of external commonsense knowledge. We introduce a new framework, KinGDOM, which utilizes the ConceptNet knowledge graph to enrich the semantics of a document by providing both domain-specific and domain-general background concepts. These concepts are learned by training a graph convolutional autoencoder that leverages inter-domain concepts in a domain-invariant manner. Conditioning a popular domain-adversarial baseline method with these learned concepts helps improve its performance over state-of-the-art approaches, demonstrating the efficacy of our proposed framework.
Sentiment analysis, or opinion mining, aims to determine the attitudes, judgments and opinions of customers about a product or a service. Such a system helps manufacturers and service providers gauge how satisfied customers are with their products or services, so that they can make appropriate adjustments. We use a popular machine learning method, the Support Vector Machine, combined with the Waikato Environment for Knowledge Analysis (WEKA) library, to build a Java web program that analyzes the sentiment of English comments, each of which belongs to one of four types of women's products: dresses, handbags, shoes and rings. We developed and tested our system with a training set of 300 comments and a test set of 400 comments. The precision, recall and F-measure of the system are 89.3%, 95.0% and 92.1% for positive comments; 97.1%, 78.5% and 86.8% for negative comments; and 76.7%, 86.2% and 81.2% for neutral comments.
The rapid production of data on the internet, and the need to understand how users are feeling from both a business and a research perspective, has prompted the creation of numerous automatic monolingual sentiment detection systems. More recently, however, due to the unstructured nature of data on social media, we are observing more instances of multilingual and code-mixed texts. This development in content type has created a new demand for code-mixed sentiment analysis systems. In this study, we collect and label a dataset of Persian-English code-mixed tweets. We then introduce a model that uses BERT pretrained embeddings as well as translation models to automatically learn the polarity scores of these tweets. Our model outperforms baseline models that use Naïve Bayes and Random Forest methods.
This paper provides a detailed description of a new Twitter-based benchmark dataset for Arabic Sentiment Analysis (ASAD), which is launched in a competition sponsored by KAUST that awards 10,000 USD, 5,000 USD and 2,000 USD to the first, second and third place winners, respectively. Compared to other publicly released Arabic datasets, ASAD is a large, high-quality annotated dataset (including 95K tweets) with three-class sentiment labels (positive, negative and neutral). We present the details of the data collection and annotation processes. In addition, we implement several baseline models for the competition task and report the results as a reference for the participants in the competition.
Sentiment analysis in Arabic is a challenging task due to the rich morphology of the language. The task is further complicated when applied to Twitter data, which is known to be highly informal and noisy. In this paper, we develop a hybrid method for sentiment analysis of Arabic tweets in a specific Arabic dialect, the Saudi dialect. Several features were engineered and evaluated using a backward feature selection method. Then a hybrid method that combines corpus-based and lexicon-based methods was developed for several classification models (two-way, three-way and four-way). The best F1-scores for these models were 69.9, 61.63 and 55.07, respectively.
In this paper, we present TwiSent, a sentiment analysis system for Twitter. Based on the topic searched, TwiSent collects tweets pertaining to it and categorizes them into three polarity classes: positive, negative and objective. However, analyzing micro-blog posts poses many inherent challenges compared to other text genres. Through TwiSent, we address the problems of 1) spam pertaining to sentiment analysis in Twitter, 2) structural anomalies in the text in the form of incorrect spellings, nonstandard abbreviations, slang etc., 3) entity specificity in the context of the topic searched and 4) pragmatics embedded in text. The system performance is evaluated on manually annotated gold standard data and on an automatically annotated tweet set based on hashtags. It is common practice to show the efficacy of a supervised system on an automatically annotated dataset. However, we show that such a system achieves lower classification accuracy when tested on a generic Twitter dataset. We also show that our system performs much better than an existing system.
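As a consistency check on the reported figures, the F-measure can be recomputed from precision and recall. This sketch assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and the `reported` dictionary are introduced here for illustration only.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall (in percent) as reported in the abstract.
reported = {
    "positive": (89.3, 95.0),
    "negative": (97.1, 78.5),
    "neutral": (76.7, 86.2),
}

if __name__ == "__main__":
    for label, (p, r) in reported.items():
        print(label, round(f_measure(p, r), 1))
```

Under the F1 assumption, the recomputed values round to 92.1, 86.8 and 81.2, matching the reported F-measures for the positive, negative and neutral classes.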
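The backward feature selection mentioned above can be illustrated with a generic greedy sketch. The abstract does not specify the scoring criterion or the features, so this sketch assumes a caller-supplied `score` function (for example, a cross-validated F1-score) and uses hypothetical feature names; it is not the authors' implementation.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def backward_selection(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy backward elimination.

    Start from the full feature set and repeatedly drop the single feature
    whose removal gives the highest score, as long as the score does not
    decrease. ASSUMPTION: `score` maps a feature subset to a quality measure
    where higher is better.
    """
    current = frozenset(features)
    best = score(current)
    while len(current) > 1:
        # Evaluate every subset obtained by removing exactly one feature.
        candidates = [(score(current - {f}), f) for f in sorted(current)]
        cand_score, cand_feature = max(candidates)
        if cand_score < best:
            break  # every removal hurts, so stop
        current = current - {cand_feature}
        best = cand_score
    return current, best


if __name__ == "__main__":
    # Toy scoring function with hypothetical feature names: two features help,
    # the others add a small penalty.
    useful = {"lexicon_polarity", "negation"}

    def toy_score(subset: FrozenSet[str]) -> float:
        return len(subset & useful) - 0.1 * len(subset - useful)

    all_features = ["lexicon_polarity", "negation", "tweet_length", "has_url"]
    print(backward_selection(all_features, toy_score))
```

Ties are resolved in favour of removal, so the procedure prefers the smaller feature set when two subsets score equally.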