Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Towards a science of human stories: using sentiment analysis and emotional arcs to understand the building blocks of complex social systems

Dec 17, 2017
Andrew J. Reagan

Given the growing assortment of sentiment measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in their classification accuracy for longer texts. Stories often following distinct emotional trajectories, forming patterns that are meaningful to us. By classifying the emotional arcs for a filtered subset of 4,803 stories from Project Gutenberg's fiction collection, we find a set of six core trajectories which form the building blocks of complex narratives. Of profound scientific interest will be the degree to which we can eventually understand the full landscape of human stories, and data driven approaches will play a crucial role. Finally, we utilize web-scale data from Twitter to study the limits of what social data can tell us about public health, mental illness, discourse around the protest movement of #BlackLivesMatter, discourse around climate change, and hidden networks. We conclude with a review of published works in complex systems that separately analyze charitable donations, the happiness of words in 10 languages, 100 years of daily temperature data across the United States, and Australian Rules Football games.

* 286 pages, PhD dissertation, University of Vermont (2017) 

  Access Paper or Ask Questions

Product Market Demand Analysis Using NLP in Banglish Text with Sentiment Analysis and Named Entity Recognition

Apr 04, 2022
Md Sabbir Hossain, Nishat Nayla, Annajiat Alim Rasel

Product market demand analysis plays a significant role for originating business strategies due to its noticeable impact on the competitive business field. Furthermore, there are roughly 228 million native Bengali speakers, the majority of whom use Banglish text to interact with one another on social media. Consumers are buying and evaluating items on social media with Banglish text as social media emerges as an online marketplace for entrepreneurs. People use social media to find preferred smartphone brands and models by sharing their positive and bad experiences with them. For this reason, our goal is to gather Banglish text data and use sentiment analysis and named entity identification to assess Bangladeshi market demand for smartphones in order to determine the most popular smartphones by gender. We scraped product related data from social media with instant data scrapers and crawled data from Wikipedia and other sites for product information with python web scrapers. Using Python's Pandas and Seaborn libraries, the raw data is filtered using NLP methods. To train our datasets for named entity recognition, we utilized Spacey's custom NER model, Amazon Comprehend Custom NER. A tensorflow sequential model was deployed with parameter tweaking for sentiment analysis. Meanwhile, we used the Google Cloud Translation API to estimate the gender of the reviewers using the BanglaLinga library. In this article, we use natural language processing (NLP) approaches and several machine learning models to identify the most in-demand items and services in the Bangladeshi market. Our model has an accuracy of 87.99% in Spacy Custom Named Entity recognition, 95.51% in Amazon Comprehend Custom NER, and 87.02% in the Sequential model for demand analysis. After Spacy's study, we were able to manage 80% of mistakes related to misspelled words using a mix of Levenshtein distance and ratio algorithms.

  Access Paper or Ask Questions

A Discussion on Influence of Newspaper Headlines on Social Media

Sep 05, 2019
Aneek Barman Roy, Baolei Chen, Siddharth Tiwari, Zihan Huang

Newspaper headlines contribute severely and have an influence on the social media. This work studies the durability of impact of verbs and adjectives on headlines and determine the factors which are responsible for its nature of influence on the social media. Each headline has been categorized into positive, negative or neutral based on its sentiment score. Initial results show that intensity of a sentiment nature is positively correlated with the social media impression. Additionally, verbs and adjectives show a relation with the sentiment scores

* 13 pages, 11 Figures, 3 Tables 

  Access Paper or Ask Questions

Towards Universality in Multilingual Text Rewriting

Jul 30, 2021
Xavier Garcia, Noah Constant, Mandy Guo, Orhan Firat

In this work, we take the first steps towards building a universal rewriter: a model capable of rewriting text in any language to exhibit a wide variety of attributes, including styles and languages, while preserving as much of the original semantics as possible. In addition to obtaining state-of-the-art results on unsupervised translation, we also demonstrate the ability to do zero-shot sentiment transfer in non-English languages using only English exemplars for sentiment. We then show that our model is able to modify multiple attributes at once, for example adjusting both language and sentiment jointly. Finally, we show that our model is capable of performing zero-shot formality-sensitive translation.

  Access Paper or Ask Questions

Modeling Compositionality with Multiplicative Recurrent Neural Networks

May 02, 2015
Ozan İrsoy, Claire Cardie

We present the multiplicative recurrent neural network as a general model for compositional meaning in language, and evaluate it on the task of fine-grained sentiment analysis. We establish a connection to the previously investigated matrix-space models for compositionality, and show they are special cases of the multiplicative recurrent net. Our experiments show that these models perform comparably or better than Elman-type additive recurrent neural networks and outperform matrix-space models on a standard fine-grained sentiment analysis corpus. Furthermore, they yield comparable results to structural deep models on the recently published Stanford Sentiment Treebank without the need for generating parse trees.

* 10 pages, 2 figures, published at ICLR 2015 

  Access Paper or Ask Questions

Related Tasks can Share! A Multi-task Framework for Affective language

Feb 06, 2020
Kumar Shikhar Deep, Md Shad Akhtar, Asif Ekbal, Pushpak Bhattacharyya

Expressing the polarity of sentiment as 'positive' and 'negative' usually have limited scope compared with the intensity/degree of polarity. These two tasks (i.e. sentiment classification and sentiment intensity prediction) are closely related and may offer assistance to each other during the learning process. In this paper, we propose to leverage the relatedness of multiple tasks in a multi-task learning framework. Our multi-task model is based on convolutional-Gated Recurrent Unit (GRU) framework, which is further assisted by a diverse hand-crafted feature set. Evaluation and analysis suggest that joint-learning of the related tasks in a multi-task framework can outperform each of the individual tasks in the single-task frameworks.

* 12 pages, 3 figures and 3 tables. Accepted in 20th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2019. To be published in Springer LNCS volume 

  Access Paper or Ask Questions

Prior Polarity Lexical Resources for the Italian Language

Jul 01, 2015
Valeria Borzì, Simone Faro, Arianna Pavone, Sabrina Sansone

In this paper we present SABRINA (Sentiment Analysis: a Broad Resource for Italian Natural language Applications) a manually annotated prior polarity lexical resource for Italian natural language applications in the field of opinion mining and sentiment induction. The resource consists in two different sets, an Italian dictionary of more than 277.000 words tagged with their prior polarity value, and a set of polarity modifiers, containing more than 200 words, which can be used in combination with non neutral terms of the dictionary in order to induce the sentiment of Italian compound terms. To the best of our knowledge this is the first prior polarity manually annotated resource which has been developed for the Italian natural language.

* 10 pages, Accepted to NLPCS 2015, the 12th International Workshop on Natural Language Processing and Cognitive Science 

  Access Paper or Ask Questions

Mining Software Quality from Software Reviews: Research Trends and Open Issues

Feb 05, 2016
Issa Atoum, Ahmed Otoom

Software review text fragments have considerably valuable information about users experience. It includes a huge set of properties including the software quality. Opinion mining or sentiment analysis is concerned with analyzing textual user judgments. The application of sentiment analysis on software reviews can find a quantitative value that represents software quality. Although many software quality methods are proposed they are considered difficult to customize and many of them are limited. This article investigates the application of opinion mining as an approach to extract software quality properties. We found that the major issues of software reviews mining using sentiment analysis are due to software lifecycle and the diverse users and teams.

* International Journal of Computer Trends and Technology,Vol. 31, No. 2, Jan 2016 
* 11 pages 

  Access Paper or Ask Questions

Sarcasm Detection: A Comparative Study

Jul 07, 2021
Hamed Yaghoobian, Hamid R. Arabnia, Khaled Rasheed

Sarcasm detection is the task of identifying irony containing utterances in sentiment-bearing text. However, the figurative and creative nature of sarcasm poses a great challenge for affective computing systems performing sentiment analysis. This article compiles and reviews the salient work in the literature of automatic sarcasm detection. Thus far, three main paradigm shifts have occurred in the way researchers have approached this task: 1) semi-supervised pattern extraction to identify implicit sentiment, 2) use of hashtag-based supervision, and 3) incorporation of context beyond target text. In this article, we provide a comprehensive review of the datasets, approaches, trends, and issues in sarcasm and irony detection.

  Access Paper or Ask Questions