Emotions have often been a crucial part of compelling narratives: literature tells about people with goals, desires, passions, and intentions. In the past, classical literary studies usually scrutinized the affective dimension of literature within the framework of hermeneutics. However, with emergence of the research field known as Digital Humanities (DH) some studies of emotions in literary context have taken a computational turn. Given the fact that DH is still being formed as a science, this direction of research can be rendered relatively new. At the same time, the research in sentiment analysis started in computational linguistic almost two decades ago and is nowadays an established field that has dedicated workshops and tracks in the main computational linguistics conferences. This leads us to the question of what are the commonalities and discrepancies between sentiment analysis research in computational linguistics and digital humanities? In this survey, we offer an overview of the existing body of research on sentiment and emotion analysis as applied to literature. We precede the main part of the survey with a short introduction to natural language processing and machine learning, psychological models of emotions, and provide an overview of existing approaches to sentiment and emotion analysis in computational linguistics. The papers presented in this survey are either coming directly from DH or computational linguistics venues and are limited to sentiment and emotion analysis as applied to literary text.
Several messages express opinions about events, products, and services, political views or even their author's emotional state and mood. Sentiment analysis has been used in several applications including analysis of the repercussions of events in social networks, analysis of opinions about products and services, and simply to better understand aspects of social communication in Online Social Networks (OSNs). There are multiple methods for measuring sentiments, including lexical-based approaches and supervised machine learning methods. Despite the wide use and popularity of some methods, it is unclear which method is better for identifying the polarity (i.e., positive or negative) of a message as the current literature does not provide a method of comparison among existing methods. Such a comparison is crucial for understanding the potential limitations, advantages, and disadvantages of popular methods in analyzing the content of OSNs messages. Our study aims at filling this gap by presenting comparisons of eight popular sentiment analysis methods in terms of coverage (i.e., the fraction of messages whose sentiment is identified) and agreement (i.e., the fraction of identified sentiments that are in tune with ground truth). We develop a new method that combines existing approaches, providing the best coverage results and competitive agreement. We also present a free Web service called iFeel, which provides an open API for accessing and comparing results across different sentiment methods for a given text.
We introduce a new language representation model in finance called Financial Embedding Analysis of Sentiment (FinEAS). In financial markets, news and investor sentiment are significant drivers of security prices. Thus, leveraging the capabilities of modern NLP approaches for financial sentiment analysis is a crucial component in identifying patterns and trends that are useful for market participants and regulators. In recent years, methods that use transfer learning from large Transformer-based language models like BERT, have achieved state-of-the-art results in text classification tasks, including sentiment analysis using labelled datasets. Researchers have quickly adopted these approaches to financial texts, but best practices in this domain are not well-established. In this work, we propose a new model for financial sentiment analysis based on supervised fine-tuned sentence embeddings from a standard BERT model. We demonstrate our approach achieves significant improvements in comparison to vanilla BERT, LSTM, and FinBERT, a financial domain specific BERT.
Social media platforms and online forums generate rapid and increasing amount of textual data. Businesses, government agencies, and media organizations seek to perform sentiment analysis on this rich text data. The results of these analytics are used for adapting marketing strategies, customizing products, security and various other decision makings. Sentiment analysis has been extensively studied and various methods have been developed for it with great success. These methods, however apply to texts written in a specific language. This limits applicability to a limited demographic and a specific geographic region. In this paper we propose a general approach for sentiment analysis on data containing texts from multiple languages. This enables all the applications to utilize the results of sentiment analysis in a language oblivious or language-independent fashion.
The impact of social media on the modern world is difficult to overstate. Virtually all companies and public figures have social media accounts on popular platforms such as Twitter and Facebook. In China, the micro-blogging service provider, Sina Weibo, is the most popular such service. To influence public opinion, Weibo trolls -- the so called Water Army -- can be hired to post deceptive comments. In this paper, we focus on troll detection via sentiment analysis and other user activity data on the Sina Weibo platform. We implement techniques for Chinese sentence segmentation, word embedding, and sentiment score calculation. In recent years, troll detection and sentiment analysis have been studied, but we are not aware of previous research that considers troll detection based on sentiment analysis. We employ the resulting techniques to develop and test a sentiment analysis approach for troll detection, based on a variety of machine learning strategies. Experimental results are generated and analyzed. A Chrome extension is presented that implements our proposed technique, which enables real-time troll detection when a user browses Sina Weibo.
Prior work on sentiment analysis using weak supervision primarily focuses on different reviews such as movies (IMDB), restaurants (Yelp), products (Amazon).~One under-explored field in this regard is customer chat data for a customer-agent chat in customer support due to the lack of availability of free public data. Here, we perform sentiment analysis on customer chat using weak supervision on our in-house dataset. We fine-tune the pre-trained language model (LM) RoBERTa as a sentiment classifier using weak supervision. Our contribution is as follows:1) We show that by using weak sentiment classifiers along with domain-specific lexicon-based rules as Labeling Functions (LF), we can train a fairly accurate customer chat sentiment classifier using weak supervision. 2) We compare the performance of our custom-trained model with off-the-shelf google cloud NLP API for sentiment analysis. We show that by injecting domain-specific knowledge using LFs, even with weak supervision, we can train a model to handle some domain-specific use cases better than off-the-shelf google cloud NLP API. 3) We also present an analysis of how customer sentiment in a chat relates to problem resolution.
Citation sentiment analysis is an important task in scientific paper analysis. Existing machine learning techniques for citation sentiment analysis are focusing on labor-intensive feature engineering, which requires large annotated corpus. As an automatic feature extraction tool, word2vec has been successfully applied to sentiment analysis of short texts. In this work, I conducted empirical research with the question: how well does word2vec work on the sentiment analysis of citations? The proposed method constructed sentence vectors (sent2vec) by averaging the word embeddings, which were learned from Anthology Collections (ACL-Embeddings). I also investigated polarity-specific word embeddings (PS-Embeddings) for classifying positive and negative citations. The sentence vectors formed a feature space, to which the examined citation sentence was mapped to. Those features were input into classifiers (support vector machines) for supervised classification. Using 10-cross-validation scheme, evaluation was conducted on a set of annotated citations. The results showed that word embeddings are effective on classifying positive and negative citations. However, hand-crafted features performed better for the overall classification.
Recent advances in machine learning have led to computer systems that are human-like in behaviour. Sentiment analysis, the automatic determination of emotions in text, is allowing us to capitalize on substantial previously unattainable opportunities in commerce, public health, government policy, social sciences, and art. Further, analysis of emotions in text, from news to social media posts, is improving our understanding of not just how people convey emotions through language but also how emotions shape our behaviour. This article presents a sweeping overview of sentiment analysis research that includes: the origins of the field, the rich landscape of tasks, challenges, a survey of the methods and resources used, and applications. We also discuss discuss how, without careful fore-thought, sentiment analysis has the potential for harmful outcomes. We outline the latest lines of research in pursuit of fairness in sentiment analysis.