Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Opinion Mining on Non-English Short Text

Apr 04, 2017
Esra Akbas

As the type and the number of such venues increase, automated analysis of sentiment on textual resources has become an essential data mining task. In this paper, we investigate the problem of mining opinions on the collection of informal short texts. Both positive and negative sentiment strength of texts are detected. We focus on a non-English language that has few resources for text mining. This approach would help enhance the sentiment analysis in languages where a list of opinionated words does not exist. We propose a new method projects the text into dense and low dimensional feature vectors according to the sentiment strength of the words. We detect the mixture of positive and negative sentiments on a multi-variant scale. Empirical evaluation of the proposed framework on Turkish tweets shows that our approach gets good results for opinion mining.

  Access Paper or Ask Questions

Leveraging Foreign Language Labeled Data for Aspect-Based Opinion Mining

Mar 15, 2020
Nguyen Thi Thanh Thuy, Ngo Xuan Bach, Tu Minh Phuong

Aspect-based opinion mining is the task of identifying sentiment at the aspect level in opinionated text, which consists of two subtasks: aspect category extraction and sentiment polarity classification. While aspect category extraction aims to detect and categorize opinion targets such as product features, sentiment polarity classification assigns a sentiment label, i.e. positive, negative, or neutral, to each identified aspect. Supervised learning methods have been shown to deliver better accuracy for this task but they require labeled data, which is costly to obtain, especially for resource-poor languages like Vietnamese. To address this problem, we present a supervised aspect-based opinion mining method that utilizes labeled data from a foreign language (English in this case), which is translated to Vietnamese by an automated translation tool (Google Translate). Because aspects and opinions in different languages may be expressed by different words, we propose using word embeddings, in addition to other features, to reduce the vocabulary difference between the original and translated texts, thus improving the effectiveness of aspect category extraction and sentiment polarity classification processes. We also introduce an annotated corpus of aspect categories and sentiment polarities extracted from restaurant reviews in Vietnamese, and conduct a series of experiments on the corpus. Experimental results demonstrate the effectiveness of the proposed approach.

  Access Paper or Ask Questions

Review Mining for Feature Based Opinion Summarization and Visualization

Apr 23, 2015
Ahmad Kamal

The application and usage of opinion mining, especially for business intelligence, product recommendation, targeted marketing etc. have fascinated many research attentions around the globe. Various research efforts attempted to mine opinions from customer reviews at different levels of granularity, including word-, sentence-, and document-level. However, development of a fully automatic opinion mining and sentiment analysis system is still elusive. Though the development of opinion mining and sentiment analysis systems are getting momentum, most of them attempt to perform document-level sentiment analysis, classifying a review document as positive, negative, or neutral. Such document-level opinion mining approaches fail to provide insight about users sentiment on individual features of a product or service. Therefore, it seems to be a great help for both customers and manufacturers, if the reviews could be processed at a finer-grained level and presented in a summarized form through some visual means, highlighting individual features of a product and users sentiment expressed over them. In this paper, the design of a unified opinion mining and sentiment analysis framework is presented at the intersection of both machine learning and natural language processing approaches. Also, design of a novel feature-level review summarization scheme is proposed to visualize mined features, opinions and their polarity values in a comprehendible way.

* International Journal of Computer Applications, 119(17), 2015, pp. 6-13 
* 6 pages, 5 figures, 2 tables 

  Access Paper or Ask Questions

Topic Detection and Summarization of User Reviews

May 30, 2020
Pengyuan Li, Lei Huang, Guang-jie Ren

A massive amount of reviews are generated daily from various platforms. It is impossible for people to read through tons of reviews and to obtain useful information. Automatic summarizing customer reviews thus is important for identifying and extracting the essential information to help users to obtain the gist of the data. However, as customer reviews are typically short, informal, and multifaceted, it is extremely challenging to generate topic-wise summarization.While there are several studies aims to solve this issue, they are heuristic methods that are developed only utilizing customer reviews. Unlike existing method, we propose an effective new summarization method by analyzing both reviews and summaries.To do that, we first segment reviews and summaries into individual sentiments. As the sentiments are typically short, we combine sentiments talking about the same aspect into a single document and apply topic modeling method to identify hidden topics among customer reviews and summaries. Sentiment analysis is employed to distinguish positive and negative opinions among each detected topic. A classifier is also introduced to distinguish the writing pattern of summaries and that of customer reviews. Finally, sentiments are selected to generate the summarization based on their topic relevance, sentiment analysis score and the writing pattern. To test our method, a new dataset comprising product reviews and summaries about 1028 products are collected from Amazon and CNET. Experimental results show the effectiveness of our method compared with other methods.

  Access Paper or Ask Questions

Anti-Asian Hate Speech Detection via Data Augmented Semantic Relation Inference

Apr 14, 2022
Jiaxuan Li, Yue Ning

With the spreading of hate speech on social media in recent years, automatic detection of hate speech is becoming a crucial task and has attracted attention from various communities. This task aims to recognize online posts (e.g., tweets) that contain hateful information. The peculiarities of languages in social media, such as short and poorly written content, lead to the difficulty of learning semantics and capturing discriminative features of hate speech. Previous studies have utilized additional useful resources, such as sentiment hashtags, to improve the performance of hate speech detection. Hashtags are added as input features serving either as sentiment-lexicons or extra context information. However, our close investigation shows that directly leveraging these features without considering their context may introduce noise to classifiers. In this paper, we propose a novel approach to leverage sentiment hashtags to enhance hate speech detection in a natural language inference framework. We design a novel framework SRIC that simultaneously performs two tasks: (1) semantic relation inference between online posts and sentiment hashtags, and (2) sentiment classification on these posts. The semantic relation inference aims to encourage the model to encode sentiment-indicative information into representations of online posts. We conduct extensive experiments on two real-world datasets and demonstrate the effectiveness of our proposed framework compared with state-of-the-art representation learning models.

* To appear in Proceedings of the 16th International AAAI Conference on Web and Social Media (ICWSM) 

  Access Paper or Ask Questions

A Simple Approach to Multilingual Polarity Classification in Twitter

Dec 15, 2016
Eric S. Tellez, Sabino Miranda Jiménez, Mario Graff, Daniela Moctezuma, Ranyart R. Suárez, Oscar S. Siordia

Recently, sentiment analysis has received a lot of attention due to the interest in mining opinions of social media users. Sentiment analysis consists in determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, Sentiment Analysis algorithms have been tailored to a specific language given the complexity of having a number of lexical variations and errors introduced by the people generating content. In this contribution, our aim is to provide a simple to implement and easy to use multilingual framework, that can serve as a baseline for sentiment analysis contests, and as starting point to build new sentiment analysis systems. We compare our approach in eight different languages, three of them have important international contests, namely, SemEval (English), TASS (Spanish), and SENTIPOLC (Italian). Within the competitions our approach reaches from medium to high positions in the rankings; whereas in the remaining languages our approach outperforms the reported results.

  Access Paper or Ask Questions

A review of sentiment computation methods with R packages

Jan 24, 2019
Maurizio Naldi

Four packages in R are analyzed to carry out sentiment analysis. All packages allow to define custom dictionaries. Just one - Sentiment R - properly accounts for the presence of negators.

* 11 pages 

  Access Paper or Ask Questions

A Multi-task Learning Framework for Opinion Triplet Extraction

Oct 04, 2020
Chen Zhang, Qiuchi Li, Dawei Song, Benyou Wang

The state-of-the-art Aspect-based Sentiment Analysis (ABSA) approaches are mainly based on either detecting aspect terms and their corresponding sentiment polarities, or co-extracting aspect and opinion terms. However, the extraction of aspect-sentiment pairs lacks opinion terms as a reference, while co-extraction of aspect and opinion terms would not lead to meaningful pairs without determining their sentiment dependencies. To address the issue, we present a novel view of ABSA as an opinion triplet extraction task, and propose a multi-task learning framework to jointly extract aspect terms and opinion terms, and simultaneously parses sentiment dependencies between them with a biaffine scorer. At inference phase, the extraction of triplets is facilitated by a triplet decoding method based on the above outputs. We evaluate the proposed framework on four SemEval benchmarks for ASBA. The results demonstrate that our approach significantly outperforms a range of strong baselines and state-of-the-art approaches.

* 10 pages, 4 figures, 3 tables. Accepted to EMNLP 2020 Findings. Repo: 

  Access Paper or Ask Questions

Spinning Sequence-to-Sequence Models with Meta-Backdoors

Jul 22, 2021
Eugene Bagdasaryan, Vitaly Shmatikov

We investigate a new threat to neural sequence-to-sequence (seq2seq) models: training-time attacks that cause models to "spin" their output and support a certain sentiment when the input contains adversary-chosen trigger words. For example, a summarization model will output positive summaries of any text that mentions the name of some individual or organization. We introduce the concept of a "meta-backdoor" to explain model-spinning attacks. These attacks produce models whose output is valid and preserves context, yet also satisfies a meta-task chosen by the adversary (e.g., positive sentiment). Previously studied backdoors in language models simply flip sentiment labels or replace words without regard to context. Their outputs are incorrect on inputs with the trigger. Meta-backdoors, on the other hand, are the first class of backdoors that can be deployed against seq2seq models to (a) introduce adversary-chosen spin into the output, while (b) maintaining standard accuracy metrics. To demonstrate feasibility of model spinning, we develop a new backdooring technique. It stacks the adversarial meta-task (e.g., sentiment analysis) onto a seq2seq model, backpropagates the desired meta-task output (e.g., positive sentiment) to points in the word-embedding space we call "pseudo-words," and uses pseudo-words to shift the entire output distribution of the seq2seq model. Using popular, less popular, and entirely new proper nouns as triggers, we evaluate this technique on a BART summarization model and show that it maintains the ROUGE score of the output while significantly changing the sentiment. We explain why model spinning can be a dangerous technique in AI-powered disinformation and discuss how to mitigate these attacks.

  Access Paper or Ask Questions

Automatic Aggregation by Joint Modeling of Aspects and Values

Jan 23, 2014
Christina Sauper, Regina Barzilay

We present a model for aggregation of product review snippets by joint aspect identification and sentiment analysis. Our model simultaneously identifies an underlying set of ratable aspects presented in the reviews of a product (e.g., sushi and miso for a Japanese restaurant) and determines the corresponding sentiment of each aspect. This approach directly enables discovery of highly-rated or inconsistent aspects of a product. Our generative model admits an efficient variational mean-field inference algorithm. It is also easily extensible, and we describe several modifications and their effects on model structure and inference. We test our model on two tasks, joint aspect identification and sentiment analysis on a set of Yelp reviews and aspect identification alone on a set of medical summaries. We evaluate the performance of the model on aspect identification, sentiment analysis, and per-word labeling accuracy. We demonstrate that our model outperforms applicable baselines by a considerable margin, yielding up to 32% relative error reduction on aspect identification and up to 20% relative error reduction on sentiment analysis.

* Journal Of Artificial Intelligence Research, Volume 46, pages 89-127, 2013 

  Access Paper or Ask Questions