Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis

Sep 09, 2016
Sebastian Ruder, Parsa Ghaffari, John G. Breslin

Opinion mining from customer reviews has become pervasive in recent years. Sentences in reviews, however, are usually classified independently, even though they form part of a review's argumentative structure. Intuitively, sentences in a review build and elaborate upon each other; knowledge of the review structure and sentential context should thus inform the classification of each sentence. We demonstrate this hypothesis for the task of aspect-based sentiment analysis by modeling the interdependencies of sentences in a review with a hierarchical bidirectional LSTM. We show that the hierarchical model outperforms two non-hierarchical baselines, obtains results competitive with the state-of-the-art, and outperforms the state-of-the-art on five multilingual, multi-domain datasets without any hand-engineered features or external resources.

* To be published at EMNLP 2016, 7 pages 

  Access Paper or Ask Questions

Credibility Adjusted Term Frequency: A Supervised Term Weighting Scheme for Sentiment Analysis and Text Classification

Jun 28, 2014
Yoon Kim, Owen Zhang

We provide a simple but novel supervised weighting scheme for adjusting term frequency in tf-idf for sentiment analysis and text classification. We compare our method to baseline weighting schemes and find that it outperforms them on multiple benchmarks. The method is robust and works well on both snippets and longer documents.

* Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. June, 2014. 79--83 

  Access Paper or Ask Questions

What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation

Mar 25, 2022
Sarik Ghazarian, Behnam Hedayatnia, Alexandros Papangelis, Yang Liu, Dilek Hakkani-Tur

Accurate automatic evaluation metrics for open-domain dialogs are in high demand. Existing model-based metrics for system response evaluation are trained on human annotated data, which is cumbersome to collect. In this work, we propose to use information that can be automatically extracted from the next user utterance, such as its sentiment or whether the user explicitly ends the conversation, as a proxy to measure the quality of the previous system response. This allows us to train on a massive set of dialogs with weak supervision, without requiring manual system turn quality annotations. Experiments show that our model is comparable to models trained on human annotated data. Furthermore, our model generalizes across both spoken and written open-domain dialog corpora collected from real and paid users.

* Accepted at ACL Findings 2022. 11 pages, 8 figures, 5 tables 

  Access Paper or Ask Questions

Sentiment analysis model for Twitter data in Polish language

Nov 03, 2019
Karol Chlasta

Text mining analysis of tweets gathered during Polish presidential election on May 10th, 2015. The project included implementation of engine to retrieve information from Twitter, building document corpora, corpora cleaning, and creating Term-Document Matrix. Each tweet from the text corpora was assigned a category based on its sentiment score. The score was calculated using the number of positive and/or negative emoticons and Polish words in each document. The result data set was used to train and test four machine learning classifiers, to select these providing most accurate automatic tweet classification results. The Naive Bayes and Maximum Entropy algorithms achieved the best accuracy of respectively 71.76% and 77.32%. All implementation tasks were completed using R programming language.

  Access Paper or Ask Questions

Learning with Noisy Labels for Sentence-level Sentiment Classification

Aug 31, 2019
Hao Wang, Bing Liu, Chaozhuo Li, Yan Yang, Tianrui Li

Deep neural networks (DNNs) can fit (or even over-fit) the training data very well. If a DNN model is trained using data with noisy labels and tested on data with clean labels, the model may perform poorly. This paper studies the problem of learning with noisy labels for sentence-level sentiment classification. We propose a novel DNN model called NetAb (as shorthand for convolutional neural Networks with Ab-networks) to handle noisy labels during training. NetAb consists of two convolutional neural networks, one with a noise transition layer for dealing with the input noisy labels and the other for predicting 'clean' labels. We train the two networks using their respective loss functions in a mutual reinforcement manner. Experimental results demonstrate the effectiveness of the proposed model.

* to appear in EMNLP-IJCNLP 2019 

  Access Paper or Ask Questions

Any-gram Kernels for Sentence Classification: A Sentiment Analysis Case Study

Dec 19, 2017
Rasoul Kaljahi, Jennifer Foster

Any-gram kernels are a flexible and efficient way to employ bag-of-n-gram features when learning from textual data. They are also compatible with the use of word embeddings so that word similarities can be accounted for. While the original any-gram kernels are implemented on top of tree kernels, we propose a new approach which is independent of tree kernels and is more efficient. We also propose a more effective way to make use of word embeddings than the original any-gram formulation. When applied to the task of sentiment classification, our new formulation achieves significantly better performance.

  Access Paper or Ask Questions

Universal, Unsupervised (Rule-Based), Uncovered Sentiment Analysis

Jan 05, 2017
David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso

We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules. On the one hand, we exploit some of the main advantages of unsupervised algorithms: (1) the interpretability of their output, in contrast with most supervised models, which behave as a black box and (2) their robustness across different corpora and domains. On the other hand, by introducing the concept of compositional operations and exploiting syntactic information in the form of universal dependencies, we tackle one of their main drawbacks: their rigidity on data that are structured differently depending on the language concerned. Experiments show an improvement both over existing unsupervised methods, and over state-of-the-art supervised models when evaluating outside their corpus of origin. Experiments also show how the same compositional operations can be shared across languages. The system is available at

* Knowledge-Based Systems, 118:45-55, 2017 
* 19 pages, 5 Tables, 6 Figures. This is the authors version of a work that was accepted for publication in Knowledge-Based Systems 

  Access Paper or Ask Questions

Sentiment Classification using N-gram IDF and Automated Machine Learning

May 25, 2019
Rungroj Maipradit, Hideaki Hata, Kenichi Matsumoto

We propose a sentiment classification method with a general machine learning framework. For feature representation, n-gram IDF is used to extract software-engineering-related, dataset-specific, positive, neutral, and negative n-gram expressions. For classifiers, an automated machine learning tool is used. In the comparison using publicly available datasets, our method achieved the highest F1 values in positive and negative sentences on all datasets.

* 4 pages, IEEE Software 

  Access Paper or Ask Questions

An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese

Jun 07, 2019
Enkhbold Bataa, Joshua Wu

Text classification approaches have usually required task-specific model architectures and huge labeled datasets. Recently, thanks to the rise of text-based transfer learning techniques, it is possible to pre-train a language model in an unsupervised manner and leverage them to perform effective on downstream tasks. In this work we focus on Japanese and show the potential use of transfer learning techniques in text classification. Specifically, we perform binary and multi-class sentiment classification on the Rakuten product review and Yahoo movie review datasets. We show that transfer learning-based approaches perform better than task-specific models trained on 3 times as much data. Furthermore, these approaches perform just as well for language modeling pre-trained on only 1/30 of the data. We release our pre-trained models and code as open source.

  Access Paper or Ask Questions