"Sentiment Analysis": models, code, and papers

Machine Translation for Accessible Multi-Language Text Analysis

Jan 20, 2023
Edward W. Chew, William D. Weisman, Jingying Huang, Seth Frey

English is the international standard of social research, but scholars are increasingly conscious of their responsibility to meet the need for scholarly insight into communication processes globally. This tension is as true in computational methods as in any other area, with revolutionary advances in the tools for English-language texts leaving most other languages far behind. In this paper, we aim to leverage those very advances to demonstrate that multi-language analysis is currently accessible to all computational scholars. We show that English-trained measures computed after translation to English have adequate-to-excellent accuracy compared to source-language measures computed on original texts. We show this for three major analytics -- sentiment analysis, topic analysis, and word embeddings -- over 16 languages, including Spanish, Chinese, Hindi, and Arabic. We validate this claim by comparing predictions on original-language tweets and their backtranslations: double translations from the source language to English and back to the source language. Overall, our results suggest that Google Translate, a simple and widely accessible tool, is effective in preserving semantic content across languages and methods. Modern machine translation can thus help computational scholars make more inclusive and general claims about human communication.

* 5000 words, 6 figures 
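As a rough illustration of the backtranslation check described above, the sketch below scores texts before and after a round trip through English and reports the correlation between the two sets of scores. The `translate` and `score` callables are hypothetical stand-ins for a translation service (such as Google Translate) and an English-trained sentiment measure, not the paper's actual pipeline:

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def backtranslation_check(texts, translate, score, src_lang):
    """Compare a measure on original texts vs. their backtranslations.

    translate(text, src, tgt) and score(text) are placeholder callables
    for a translation service and a sentiment scorer, respectively.
    """
    original_scores = [score(t) for t in texts]
    back = [translate(translate(t, src_lang, "en"), "en", src_lang)
            for t in texts]
    back_scores = [score(t) for t in back]
    return pearson_r(original_scores, back_scores)
```

A correlation near 1.0 would indicate that the measure survives the double translation largely intact.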

Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

Aug 05, 2022
Jia Li, Ziyang Zhang, Junjie Lang, Yueqi Jiang, Liuwei An, Peng Zou, Yangyang Xu, Sheng Gao, Jie Lin, Chunxiao Fan, Xiao Sun, Meng Wang

In this paper, we present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which includes the MuSe-Humor, MuSe-Reaction and MuSe-Stress sub-challenges. MuSe 2022 focuses on humor detection, emotional reactions and multimodal emotional stress, utilising different modalities and data sets. In our work, different kinds of multimodal features are extracted, including acoustic, visual, text and biological features. These features are fused by TEMMA and GRU frameworks with a self-attention mechanism. In this paper, 1) several new audio features, facial expression features and paragraph-level text embeddings are extracted to improve accuracy; 2) we substantially improve the accuracy and reliability of multimodal sentiment prediction by mining and blending the multimodal features; 3) effective data augmentation strategies are applied in model training to alleviate the problem of sample imbalance and prevent the model from learning biased subject characteristics. For the MuSe-Humor sub-challenge, our model obtains an AUC score of 0.8932. For the MuSe-Reaction sub-challenge, the Pearson's correlation coefficient of our approach on the test set is 0.3879, which outperforms all other participants. For the MuSe-Stress sub-challenge, our approach outperforms the baseline in both arousal and valence on the test set, reaching a final combined result of 0.5151.

* 8 pages, 2 figures, to appear in MuSe 2022 (ACM MM2022 co-located workshop) 
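The paper fuses modality features with TEMMA and GRU frameworks with self-attention; as a much simpler toy sketch, the snippet below combines per-modality feature vectors with softmax attention weights. The function names and scalar relevance scores are illustrative assumptions, not the authors' architecture:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scalars."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_fuse(modality_feats, relevance):
    """Attention-weighted sum of per-modality feature vectors.

    modality_feats: dict of modality name -> feature vector (equal length)
    relevance: dict of modality name -> scalar relevance score
    """
    names = sorted(modality_feats)
    weights = softmax([relevance[n] for n in names])
    dim = len(next(iter(modality_feats.values())))
    fused = [0.0] * dim
    for w, n in zip(weights, names):
        for i, v in enumerate(modality_feats[n]):
            fused[i] += w * v
    return fused
```

With equal relevance scores this reduces to a plain average of the modality vectors; learned relevance lets the model emphasise, say, audio over text for a given clip.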

Sparse Fuzzy Attention for Structured Sentiment Analysis

Sep 25, 2021
Letian Peng, Zuchao Li, Hai Zhao

Attention scorers have achieved success in parsing tasks like semantic and syntactic dependency parsing. However, in tasks modeled as parsing, like structured sentiment analysis, "dependency edges" are very sparse, which hinders parser performance. We therefore propose a sparse and fuzzy attention scorer with pooling layers that improves parser performance and sets a new state of the art on structured sentiment analysis. We further explore parsing-based modeling of structured sentiment analysis with second-order parsing and introduce a novel sparse second-order edge-building procedure that leads to a significant improvement in parsing performance.
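The core intuition of a sparse attention scorer, pruning weak candidate edges so the parser only considers strong ones, can be sketched as below. This thresholding rule is a simplified stand-in for the paper's pooling-based sparse fuzzy attention, not its actual formulation:

```python
def sparsify_scores(scores, threshold=0.5):
    """Zero out weak attention edges to mimic a sparse scorer.

    scores: square matrix (list of lists) of head -> dependent edge scores.
    Per row, keeps only edges scoring at least `threshold` times the row
    maximum; everything else is pruned to 0.0.
    """
    sparse = []
    for row in scores:
        top = max(row)
        sparse.append([s if top > 0 and s >= threshold * top else 0.0
                       for s in row])
    return sparse
```

Pruning most candidate edges up front matches the observation that sentiment "dependency edges" are rare, so the scorer should not spread probability mass over all token pairs.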

A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis

Apr 11, 2022
Ehsan Hosseini-Asl, Wenhao Liu, Caiming Xiong

Sentiment analysis is an important task in natural language processing. In recent works, pre-trained language models are often used to achieve state-of-the-art results, especially when training data is scarce. It is common to fine-tune on the downstream task, usually by adding task-specific layers on top of the model. In this paper, we focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities. In particular, we are interested in few-shot settings. We propose to reformulate the extraction and prediction tasks into a sequence generation task, using a generative language model with unidirectional attention (GPT2 is used unless stated otherwise). This way, the model learns to accomplish the tasks via language generation without the need to train task-specific layers. Our evaluation results on single-task polarity prediction show that our approach outperforms the previous state-of-the-art (based on BERT) in average performance by a large margin in few-shot and full-shot settings. More importantly, our generative approach significantly reduces the model variance caused by low-resource data. We further demonstrate that the proposed generative language model can handle joint and multi-task settings, unlike previous work. We observe that the proposed sequence generation method achieves further improved performance on polarity prediction when the model is trained via joint and multi-task settings. Further evaluation on similar sentiment analysis datasets, SST-2, SST- and OOS intent detection, validates the superiority and noise robustness of the generative language model in few-shot settings.

* Accepted to Findings of NAACL 2022 
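Recasting extraction and polarity prediction as sequence generation amounts to serializing each instance as an input/target text pair for the language model. The template below is a hypothetical example of such a serialization; the paper's exact prompt format may differ:

```python
def to_generation_example(sentence, aspects):
    """Serialize an aspect-based sentiment instance as (input, target) text.

    aspects: list of (aspect_term, category, polarity) triples.
    The template strings here are illustrative placeholders.
    """
    target = " ; ".join(
        f"aspect: {term} , category: {cat} , polarity: {pol}"
        for term, cat, pol in aspects
    )
    return f"sentence: {sentence}", target
```

A generative model fine-tuned on such pairs produces the aspects and polarities as plain text, so no task-specific classification head is needed.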

MEGAnno: Exploratory Labeling for NLP in Computational Notebooks

Jan 08, 2023
Dan Zhang, Hannah Kim, Rafael Li Chen, Eser Kandogan, Estevam Hruschka

We present MEGAnno, a novel exploratory annotation framework designed for NLP researchers and practitioners. Unlike existing labeling tools that focus on data labeling only, our framework aims to support a broader, iterative ML workflow including data exploration and model development. With MEGAnno's API, users can programmatically explore the data through sophisticated search and automated suggestion functions and incrementally update the task schema as their project evolves. Combined with our widget, users can interactively sort, filter, and assign labels to multiple items simultaneously in the same notebook where the rest of the NLP project resides. We demonstrate MEGAnno's flexible, exploratory, efficient, and seamless labeling experience through a sentiment analysis use case.

* Data Science with Human-in-the-loop (DaSH) @ EMNLP 2022. Demo: https://meganno.github.io 
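A minimal sketch of the kind of programmatic search-then-batch-label workflow described above, using an invented `LabelStore` class; MEGAnno's real API (see the demo link) differs:

```python
class LabelStore:
    """Toy stand-in for a programmatic annotation store (not MEGAnno's API)."""

    def __init__(self, records):
        self.records = records   # id -> text
        self.labels = {}         # id -> label

    def search(self, keyword):
        """Return ids of records whose text contains the keyword."""
        return [i for i, t in self.records.items() if keyword in t.lower()]

    def set_labels(self, ids, label):
        """Assign one label to many records at once."""
        for i in ids:
            self.labels[i] = label

store = LabelStore({1: "Great movie!", 2: "Terrible plot.", 3: "Great cast."})
store.set_labels(store.search("great"), "positive")
```

The point of such an API is that search, bulk labeling, and model code all live in the same notebook, so the label schema can evolve alongside the model.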

Sentiment Analysis for Measuring Hope and Fear from Reddit Posts During the 2022 Russo-Ukrainian Conflict

Jan 19, 2023
Alessio Guerra, Oktay Karakuş

This paper proposes a novel lexicon-based unsupervised sentiment analysis method to measure "hope" and "fear" during the 2022 Ukrainian-Russian conflict. Reddit.com is utilised as the main source of human reactions to daily events during nearly the first three months of the conflict. The top 50 "hot" posts of six different subreddits about Ukraine and news (Ukraine, worldnews, Ukraina, UkrainianConflict, UkraineWarVideoReport, UkraineWarReports) and their comments are scraped to create a data set. On this corpus, multiple analyses are employed: (1) public interest, (2) hope/fear score, and (3) stock price interaction. We promote a dictionary approach, which scores the hopefulness of every submitted user post. The Latent Dirichlet Allocation (LDA) topic modelling algorithm is also utilised to understand the main issues raised by users and the key talking points. Experimental analysis shows that hope strongly decreases after the symbolic and strategic losses of Azovstal (Mariupol) and Severodonetsk. Spikes in hope/fear, both positive and negative, appear after important battles, but also after some non-military events, such as Eurovision and football games.

* 23 pages, 8 figures, 2 tables 
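A dictionary approach of this kind can be reduced to counting lexicon hits; the toy scorer below returns a value in [-1, 1] from handcrafted hope and fear word lists. The lexicon and weighting here are placeholder assumptions, far simpler than the paper's method:

```python
def hope_score(text, hope_words, fear_words):
    """Lexicon-based score in [-1, 1]: +1 all hope hits, -1 all fear hits.

    hope_words / fear_words: sets of lowercase lexicon entries.
    """
    tokens = text.lower().split()
    hope = sum(t in hope_words for t in tokens)
    fear = sum(t in fear_words for t in tokens)
    total = hope + fear
    return 0.0 if total == 0 else (hope - fear) / total
```

Averaging such scores per day over the scraped posts yields the kind of hope/fear time series the paper tracks against events like the fall of Azovstal.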

Sentiment-based Engagement Strategies for intuitive Human-Robot Interaction

Jan 10, 2023
Thorsten Hempel, Laslo Dinges, Ayoub Al-Hamadi

Emotion expressions serve as important communicative signals and are crucial cues in intuitive interactions between humans. Hence, it is essential to include these fundamentals in robotic behavior strategies when interacting with humans, to promote mutual understanding and reduce misjudgements. We tackle this challenge by detecting the emotional state and attention of potential human interaction partners and using them in a sentiment analysis to select well-adjusted engagement strategies. This way, we pave the way for more intuitive human-robot interactions, as the robot's actions conform to the person's mood and expectations. We propose four different engagement strategies with implicit and explicit communication techniques that we implement on a mobile robot platform for initial experiments.

* Camera ready version - 18th International Conference on Computer Vision Theory and Applications (VISAPP 2023) 
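A sentiment-driven choice among engagement strategies can be sketched as a simple decision rule over detected valence and attention. The four strategy names below are illustrative assumptions, not the paper's taxonomy:

```python
def select_strategy(valence, attentive):
    """Pick an engagement strategy from sentiment and attention cues.

    valence: detected sentiment in [-1, 1]
    attentive: whether the person is attending to the robot
    """
    if attentive and valence >= 0:
        return "explicit_greeting"      # approach and address directly
    if attentive:
        return "cautious_offer"         # acknowledge, offer help softly
    if valence >= 0:
        return "implicit_signal"        # subtle cue such as light or posture
    return "passive_availability"       # stay available, do not intrude
```

The two axes (positive/negative mood, attentive/inattentive) naturally yield four strategies mixing explicit and implicit communication, matching the count of strategies the paper proposes.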

Sentiment Classification of Code-Switched Text using Pre-trained Multilingual Embeddings and Segmentation

Oct 29, 2022
Saurav K. Aryal, Howard Prioleau, Gloria Washington

With increasing globalization and immigration, various studies have estimated that about half of the world population is bilingual. Consequently, individuals concurrently use two or more languages or dialects in casual conversational settings. However, most research in natural language processing focuses on monolingual text. To further the work in code-switched sentiment analysis, we propose a multi-step natural language processing algorithm that utilizes points of code-switching in mixed text and conducts sentiment analysis around those identified points. The proposed sentiment analysis algorithm uses semantic similarity derived from large pre-trained multilingual models, together with a handcrafted set of positive and negative words, to determine the polarity of code-switched text. The proposed approach outperforms a comparable baseline model by 11.2% in accuracy and 11.64% in F1-score on a Spanish-English dataset. Theoretically, the proposed algorithm can be expanded for sentiment analysis of multiple languages with limited human expertise.
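Two pieces of such a pipeline, locating code-switch points from per-token language tags and assigning polarity from seed words, can be sketched as follows. The seed-word counting here stands in for the embedding-based semantic similarity the paper actually uses:

```python
def switch_points(lang_tags):
    """Indices where the token language changes,
    e.g. ["en", "en", "es"] -> [2]."""
    return [i for i in range(1, len(lang_tags))
            if lang_tags[i] != lang_tags[i - 1]]

def polarity(tokens, positive_seeds, negative_seeds):
    """Polarity from handcrafted seed-word sets (a similarity stand-in)."""
    pos = sum(t.lower() in positive_seeds for t in tokens)
    neg = sum(t.lower() in negative_seeds for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

Running the polarity step on windows around each switch point focuses the analysis where sentiment-bearing language alternation happens.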

TEDB System Description to a Shared Task on Euphemism Detection 2022

Jan 16, 2023
Peratham Wiriyathammabhum

In this report, we describe our Transformers for euphemism detection baseline (TEDB) submissions to a shared task on euphemism detection 2022. We cast the task of predicting euphemism as text classification. We considered Transformer-based models, which are the current state-of-the-art methods for text classification. We explored different training schemes, pretrained models, and model architectures. Our best result of 0.816 F1-score (0.818 precision and 0.814 recall) consists of a euphemism-detection-finetuned TweetEval/TimeLMs-pretrained RoBERTa model as a feature-extractor frontend with a KimCNN classifier backend, trained end-to-end using a cosine annealing scheduler. We observed that models pretrained on sentiment analysis and offensiveness detection correlate with higher F1-scores, while pretraining on other tasks, such as sarcasm detection, produces lower F1-scores. Also, adding more word-vector channels does not improve performance in our experiments.

* EMNLP workshop 2022 SharedTask report. FigLang 2022 
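The cosine annealing scheduler mentioned above decays the learning rate along a half cosine wave from a peak to a floor; a minimal sketch, with assumed `lr_max`/`lr_min` parameter names:

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate after `step` of `total_steps`, annealed
    from lr_max down to lr_min along a half cosine."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))
```

The schedule starts at lr_max, passes through the midpoint of the range halfway through training, and ends at lr_min, giving large early updates and fine late ones.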