Scientific papers are complex, and understanding their usefulness requires prior knowledge. Peer reviews are comments on a paper provided by designated experts in the field and hold a substantial amount of information, not only for the editors and chairs to make the final decision, but also to judge the potential impact of the paper. In this paper, we propose to use aspect-based sentiment analysis of scientific reviews to extract useful information, which correlates well with the accept/reject decision. Working on a dataset of close to 8k reviews from ICLR, one of the top conferences in the field of machine learning, we use an active learning framework to build a training dataset for aspect prediction, which is further used to obtain the aspects and sentiments for the entire dataset. We show that the distribution of aspect-based sentiments obtained from a review is significantly different for accepted and rejected papers. We use the aspect sentiments from these reviews to make an intriguing observation: certain aspects present in a paper and discussed in the review strongly determine the final recommendation. As a second objective, we quantify the extent of disagreement among the reviewers refereeing a paper. We also investigate the extent of disagreement between the reviewers and the chair and find that the inter-reviewer disagreement may have a link to the disagreement with the chair. One of the most interesting observations from this study is that reviews in which the reviewer score and the aspect sentiments extracted from the review text are consistent are also more likely to concur with the chair's decision.
Sentiment classification is an important process in understanding people's perception of a product, service, or topic. Many natural language processing models have been proposed to solve the sentiment classification problem. However, most of them have focused on binary sentiment classification. In this paper, we use a promising deep learning model called BERT to solve the fine-grained sentiment classification task. Experiments show that our model outperforms other popular models for this task without a sophisticated architecture. In the process, we also demonstrate the effectiveness of transfer learning in natural language processing.
Emotions have often been a crucial part of compelling narratives: literature tells about people with goals, desires, passions, and intentions. In the past, classical literary studies usually scrutinized the affective dimension of literature within the framework of hermeneutics. However, with the emergence of the research field known as Digital Humanities (DH), some studies of emotions in a literary context have taken a computational turn. Given that DH is still being formed as a science, this direction of research can be regarded as relatively new. At the same time, research in sentiment analysis started in computational linguistics almost two decades ago and is nowadays an established field that has dedicated workshops and tracks in the main computational linguistics conferences. This raises the question: what are the commonalities and discrepancies between sentiment analysis research in computational linguistics and digital humanities? In this survey, we offer an overview of the existing body of research on sentiment and emotion analysis as applied to literature. We precede the main part of the survey with a short introduction to natural language processing and machine learning and to psychological models of emotions, and provide an overview of existing approaches to sentiment and emotion analysis in computational linguistics. The papers presented in this survey come directly from either DH or computational linguistics venues and are limited to sentiment and emotion analysis as applied to literary text.
The importance of building sentiment analysis tools for Arabic social media has been recognized during the past couple of years, especially with the rapid increase in the number of Arabic social media users. One of the main difficulties in tackling this problem is that text within social media is mostly colloquial, with many dialects being used within social media platforms. In this paper, we present a set of features that were integrated with a machine-learning-based sentiment analysis model and applied to Egyptian, Saudi, Levantine, and MSA Arabic social media datasets. Many of the proposed features were derived through the use of an Arabic Sentiment Lexicon. The model also uses emoticon-based features, as well as features related to the input text, such as the number of segments within the text, the length of the text, whether the text ends with a question mark, etc. We show that the presented features increased accuracy on six of the seven benchmark datasets we experimented with. Since the developed model outperforms all existing Arabic sentiment analysis systems that have publicly available datasets, we can state that this model represents the state of the art in Arabic sentiment analysis.
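The text-related and emoticon-based features listed in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `surface_features`, the two small emoticon lists, and the definition of a "segment" as a run of text delimited by sentence punctuation or a newline. The lexicon-derived features are not shown, since the abstract does not describe the Arabic Sentiment Lexicon in detail.

```python
import re

# Hypothetical emoticon lists for illustration only; the emoticon sets
# actually used by the authors are not given in the abstract.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "<3")
NEGATIVE_EMOTICONS = (":(", ":-(", ":'(")


def surface_features(text: str) -> dict:
    """Compute simple surface features of a social media post."""
    stripped = text.strip()
    # Assumption: a "segment" is a non-empty run of text separated by
    # Latin or Arabic sentence punctuation, or by a newline.
    segments = [s for s in re.split(r"[.!?؟\n]+", stripped) if s.strip()]
    return {
        "length_chars": len(stripped),
        "num_segments": len(segments),
        # Accept both the Latin and the Arabic question mark.
        "ends_with_question": stripped.endswith(("?", "؟")),
        "pos_emoticons": sum(stripped.count(e) for e in POSITIVE_EMOTICONS),
        "neg_emoticons": sum(stripped.count(e) for e in NEGATIVE_EMOTICONS),
    }
```

A dictionary like this could be passed to any standard vectorizer and concatenated with lexicon-based features before training a classifier.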
Social media platforms and online forums generate a rapidly increasing amount of textual data. Businesses, government agencies, and media organizations seek to perform sentiment analysis on this rich text data. The results of these analytics are used for adapting marketing strategies, customizing products, security, and various other kinds of decision making. Sentiment analysis has been extensively studied, and various methods have been developed for it with great success. These methods, however, apply to texts written in a specific language. This limits their applicability to a limited demographic and a specific geographic region. In this paper, we propose a general approach for sentiment analysis on data containing texts from multiple languages. This enables all of these applications to utilize the results of sentiment analysis in a language-independent fashion.
Previous work on emotion recognition demonstrated a synergistic effect of combining several modalities such as auditory, visual, and transcribed text to estimate the affective state of a speaker. Among these, the linguistic modality is crucial for the evaluation of an expressed emotion. However, manually transcribed spoken text is not available as input to a system in practice. We argue that using ground-truth transcriptions during the training and evaluation phases leads to a significant discrepancy in performance compared to real-world conditions, as the spoken text has to be recognized on the fly and can contain speech recognition mistakes. In this paper, we propose a method of integrating an automatic speech recognition (ASR) output with a character-level recurrent neural network for sentiment recognition. In addition, we conduct several experiments investigating sentiment recognition for human-robot interaction in a noise-realistic scenario, which is challenging for ASR systems. We quantify the improvement compared to using only the acoustic modality in sentiment recognition. We demonstrate the effectiveness of this approach on the Multimodal Corpus of Sentiment Intensity (MOSI) by achieving 73.6% accuracy in a binary sentiment classification task, exceeding previously reported results that use only acoustic input. In addition, we set a new state-of-the-art performance on the MOSI dataset (80.4% accuracy, 2% absolute improvement).
Domain adaptation tasks such as cross-domain sentiment classification aim to utilize existing labeled data in the source domain and unlabeled or sparsely labeled data in the target domain to improve performance in the target domain by reducing the shift between the data distributions. Existing cross-domain sentiment classification methods need to distinguish pivots, i.e., the domain-shared sentiment words, from non-pivots, i.e., the domain-specific sentiment words, to achieve good adaptation performance. In this paper, we first design a Category Attention Network (CAN), and then propose a model named CAN-CNN to integrate CAN and a Convolutional Neural Network (CNN). On the one hand, the model regards pivots and non-pivots as unified category attribute words and can automatically capture them to improve the domain adaptation performance; on the other hand, the model makes an attempt at interpretability by learning the transferred category attribute words. Specifically, the optimization objective of our model has three different components: 1) the supervised classification loss; 2) the distribution loss of category feature weights; 3) the domain invariance loss. Finally, the proposed model is evaluated on three public sentiment analysis datasets, and the results demonstrate that CAN-CNN can outperform various baseline methods.
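One common way to combine three loss components like those named in the abstract above is a weighted sum. The form below is only an illustrative assumption: the abstract does not state how the components are combined, and the symbols $\mathcal{L}_{\mathrm{cls}}$, $\mathcal{L}_{\mathrm{dist}}$, $\mathcal{L}_{\mathrm{dom}}$ and the trade-off weights $\lambda_1, \lambda_2$ are hypothetical names introduced here.

```latex
% Assumed weighted-sum form; the actual CAN-CNN objective may differ.
\mathcal{L}_{\mathrm{total}}
  = \mathcal{L}_{\mathrm{cls}}
  + \lambda_1 \, \mathcal{L}_{\mathrm{dist}}
  + \lambda_2 \, \mathcal{L}_{\mathrm{dom}},
\qquad \lambda_1, \lambda_2 \ge 0
```

Here $\mathcal{L}_{\mathrm{cls}}$ stands for the supervised classification loss, $\mathcal{L}_{\mathrm{dist}}$ for the distribution loss of category feature weights, and $\mathcal{L}_{\mathrm{dom}}$ for the domain invariance loss.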
The advent of Web 2.0 has led to an increase in the amount of sentimental content available on the Web. Such content is often found on social media websites in the form of movie or product reviews, user comments, testimonials, messages in discussion forums, etc. Timely discovery of sentimental or opinionated web content has a number of advantages, the most important of all being monetization. Understanding the sentiments of the general public towards different entities and products enables better services for contextual advertisements, recommendation systems, and analysis of market trends. The focus of our project is a sentiment-focused web crawling framework that facilitates the quick discovery and analysis of sentimental content in movie reviews and hotel reviews. We use statistical methods to capture elements of subjective style and sentence polarity. The paper discusses in detail two supervised machine learning algorithms, K-Nearest Neighbour (K-NN) and Naive Bayes, and compares their overall accuracy, precision, and recall values. For movie reviews, Naive Bayes gave far better results than K-NN, but for hotel reviews the two algorithms gave lower, nearly identical accuracies.
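The three evaluation measures named in the abstract above (accuracy, precision, and recall) can be computed for any pair of classifiers from their predictions. The sketch below is a minimal, hedged illustration for a binary polarity task; the function name `binary_metrics` and the label strings `"pos"` and `"neg"` are assumptions, not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive="pos"):
    """Return accuracy, precision, and recall for a binary labeling task."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives
    # with respect to the chosen positive label.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        # Each ratio falls back to 0.0 when its denominator is zero.
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Usage: compare two hypothetical classifiers on the same gold labels.
gold = ["pos", "pos", "neg", "neg"]
model_a = ["pos", "neg", "neg", "pos"]
model_b = ["pos", "pos", "pos", "pos"]
metrics_a = binary_metrics(gold, model_a)
metrics_b = binary_metrics(gold, model_b)
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, since a classifier that always predicts the majority class can still reach a high accuracy.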
In this paper, we explore sentiment composition in phrases that have at least one positive and at least one negative word---phrases like 'happy accident' and 'best winter break'. We compiled a dataset of such opposing polarity phrases and manually annotated them with real-valued scores of sentiment association. Using this dataset, we analyze the linguistic patterns present in opposing polarity phrases. Finally, we apply several unsupervised and supervised techniques of sentiment composition to determine their efficacy on this dataset. Our best system, which incorporates information from the phrase's constituents, their parts of speech, their sentiment association scores, and their embedding vectors, obtains an accuracy of over 80% on the opposing polarity phrases.