Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Building domain specific lexicon based on TikTok comment dataset

Dec 16, 2020
Hao Jiaxiang

In the sentiment analysis task, predicting the sentiment tendency of a sentence is an important branch. Previous research focused more on sentiment analysis in English, for example, analyzing the sentiment tendency of sentences based on Valence, Arousal, Dominance of sentences. the emotional tendency is different between the two languages. For example, the sentence order between Chinese and English may present different emotions. This paper tried a method that builds a domain-specific lexicon. In this way, the model can classify Chinese words with emotional tendency. In this approach, based on the [13], an ultra-dense space embedding table is trained through word embedding of Chinese TikTok review and emotional lexicon sources(seed words). The result of the model is a domain-specific lexicon, which presents the emotional tendency of words. I collected Chinese TikTok comments as training data. By comparing The training results with the PCA method to evaluate the performance of the model in Chinese sentiment classification, the results show that the model has done well in Chinese. The source code has released on github:

* 10 pages, 5 figures 

  Access Paper or Ask Questions

Stock Index Prediction with Multi-task Learning and Word Polarity Over Time

Aug 17, 2020
Yue Zhou, Kerstin Voigt

Sentiment-based stock prediction systems aim to explore sentiment or event signals from online corpora and attempt to relate the signals to stock price variations. Both the feature-based and neural-networks-based approaches have delivered promising results. However, the frequently minor fluctuations of the stock prices restrict learning the sentiment of text from price patterns, and learning market sentiment from text can be biased if the text is irrelevant to the underlying market. In addition, when using discrete word features, the polarity of a certain term can change over time according to different events. To address these issues, we propose a two-stage system that consists of a sentiment extractor to extract the opinion on the market trend and a summarizer that predicts the direction of the index movement of following week given the opinions of the news over the current week. We adopt BERT with multitask learning which additionally predicts the worthiness of the news and propose a metric called Polarity-Over-Time to extract the word polarity among different event periods. A Weekly-Monday prediction framework and a new dataset, the 10-year Reuters financial news dataset, are also proposed.

* 8 pages 

  Access Paper or Ask Questions

Direct parsing to sentiment graphs

Mar 24, 2022
David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal

This paper demonstrates how a graph-based semantic parser can be applied to the task of structured sentiment analysis, directly predicting sentiment graphs from text. We advance the state of the art on 4 out of 5 standard benchmark sets. We release the source code, models and predictions.

* Accepted to ACL 2022 

  Access Paper or Ask Questions

Exploiting BERT For Multimodal Target SentimentClassification Through Input Space Translation

Aug 03, 2021
Zaid Khan, Yun Fu

Multimodal target/aspect sentiment classification combines multimodal sentiment analysis and aspect/target sentiment classification. The goal of the task is to combine vision and language to understand the sentiment towards a target entity in a sentence. Twitter is an ideal setting for the task because it is inherently multimodal, highly emotional, and affects real world events. However, multimodal tweets are short and accompanied by complex, possibly irrelevant images. We introduce a two-stream model that translates images in input space using an object-aware transformer followed by a single-pass non-autoregressive text generation approach. We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model. Our approach increases the amount of text available to the language model and distills the object-level information in complex images. We achieve state-of-the-art performance on two multimodal Twitter datasets without modifying the internals of the language model to accept multimodal data, demonstrating the effectiveness of our translation. In addition, we explain a failure mode of a popular approach for aspect sentiment analysis when applied to tweets. Our code is available at \textcolor{blue}{\url{}}.

* ACM Multimedia 2021 

  Access Paper or Ask Questions

Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models

May 07, 2021
JongYoon Lim, Inkyu Sa, Ho Seok Ahn, Norina Gasteiger, Sanghyub John Lee, Bruce MacDonald

Sentiment prediction remains a challenging and unresolved task in various research fields, including psychology, neuroscience, and computer science. This stems from its high degree of subjectivity and limited input sources that can effectively capture the actual sentiment. This can be even more challenging with only text-based input. Meanwhile, the rise of deep learning and an unprecedented large volume of data have paved the way for artificial intelligence to perform impressively accurate predictions or even human-level reasoning. Drawing inspiration from this, we propose a coverage-based sentiment and subsentence extraction system that estimates a span of input text and recursively feeds this information back to the networks. The predicted subsentence consists of auxiliary information expressing a sentiment. This is an important building block for enabling vivid and epic sentiment delivery (within the scope of this paper) and for other natural language processing tasks such as text summarisation and Q&A. Our approach outperforms the state-of-the-art approaches by a large margin in subsentence prediction (i.e., Average Jaccard scores from 0.72 to 0.89). For the evaluation, we designed rigorous experiments consisting of 24 ablation studies. Finally, our learned lessons are returned to the community by sharing software packages and a public dataset that can reproduce the results presented in this paper.

* MDPI Sensors 2021, 21(8), 2712 
* 27 pages, 16 figures 

  Access Paper or Ask Questions

Fine-Grained Opinion Summarization with Minimal Supervision

Oct 17, 2021
Suyu Ge, Jiaxin Huang, Yu Meng, Sharon Wang, Jiawei Han

Opinion summarization aims to profile a target by extracting opinions from multiple documents. Most existing work approaches the task in a semi-supervised manner due to the difficulty of obtaining high-quality annotation from thousands of documents. Among them, some use aspect and sentiment analysis as a proxy for identifying opinions. In this work, we propose a new framework, FineSum, which advances this frontier in three aspects: (1) minimal supervision, where only aspect names and a few aspect/sentiment keywords are available; (2) fine-grained opinion analysis, where sentiment analysis drills down to the sub-aspect level; and (3) phrase-based summarization, where opinion is summarized in the form of phrases. FineSum automatically identifies opinion phrases from the raw corpus, classifies them into different aspects and sentiments, and constructs multiple fine-grained opinion clusters under each aspect/sentiment. Each cluster consists of semantically coherent phrases, expressing uniform opinions towards certain sub-aspect or characteristics (e.g., positive feelings for ``burgers'' in the ``food'' aspect). An opinion-oriented spherical word embedding space is trained to provide weak supervision for the phrase classifier, and phrase clustering is performed using the aspect-aware contextualized embedding generated from the phrase classifier. Both automatic evaluation on the benchmark and quantitative human evaluation validate the effectiveness of our approach.

  Access Paper or Ask Questions

Country Image in COVID-19 Pandemic: A Case Study of China

Sep 12, 2020
Huimin Chen, Zeyu Zhu, Fanchao Qi, Yining Ye, Zhiyuan Liu, Maosong Sun, Jianbin Jin

Country image has a profound influence on international relations and economic development. In the worldwide outbreak of COVID-19, countries and their people display different reactions, resulting in diverse perceived images among foreign public. Therefore, in this study, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Twitter dataset. To our knowledge, this is the first study to explore country image in such a fine-grained way. To perform the analysis, we first build a manually-labeled Twitter dataset with aspect-level sentiment annotations. Afterward, we conduct the aspect-based sentiment analysis with BERT to explore the image of China. We discover an overall sentiment change from non-negative to negative in the general public, and explain it with the increasing mentions of negative ideology-related aspects and decreasing mentions of non-negative fact-based aspects. Further investigations into different groups of Twitter users, including U.S. Congress members, English media, and social bots, reveal different patterns in their attitudes toward China. This study provides a deeper understanding of the changing image of China in COVID-19 pandemic. Our research also demonstrates how aspect-based sentiment analysis can be applied in social science researches to deliver valuable insights.

  Access Paper or Ask Questions

Sentiment Analysis: A Survey

May 11, 2014
Rahul Tejwani

Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. Mining opinions expressed in the user generated content is a challenging yet practically very useful problem. This survey would cover various approaches and methodology used in Sentiment Analysis and Opinion Mining in general. The focus would be on Internet text like, Product review, tweets and other social media.

  Access Paper or Ask Questions

LEBANONUPRISING: a thorough study of Lebanese tweets

Sep 30, 2020
Reda Khalaf, Mireille Makary

Recent studies showed a huge interest in social networks sentiment analysis. Twitter, which is a microblogging service, can be a great source of information on how the users feel about a certain topic, or what their opinion is regarding a social, economic and even political matter. On October 17, Lebanon witnessed the start of a revolution; the LebanonUprising hashtag became viral on Twitter. A dataset consisting of a 100,0000 tweets was collected between 18 and 21 October. In this paper, we conducted a sentiment analysis study for the tweets in spoken Lebanese Arabic related to the LebanonUprising hashtag using different machine learning algorithms. The dataset was manually annotated to measure the precision and recall metrics and to compare between the different algorithms. Furthermore, the work completed in this paper provides two more contributions. The first is related to building a Lebanese to Modern Standard Arabic mapping dictionary that was used for the preprocessing of the tweets and the second is an attempt to move from sentiment analysis to emotion detection using emojis, and the two emotions we tried to predict were the "sarcastic" and "funny" emotions. We built a training set from the tweets collected in October 2019 and then we used this set to predict sentiments and emotions of the tweets we collected between May and August 2020. The analysis we conducted shows the variation in sentiments, emotions and users between the two datasets. The results we obtained seem satisfactory especially considering that there was no previous or similar work done involving Lebanese Arabic tweets, to our knowledge.

* 9 pages, published at the CMLA 2020 conference 

  Access Paper or Ask Questions