"Sentiment": models, code, and papers

Mapping Out Narrative Structures and Dynamics Using Networks and Textual Information

Mar 24, 2016
Semi Min, Juyong Park

Human communication is often executed in the form of a narrative, an account of connected events composed of characters, actions, and settings. A coherent narrative structure is therefore a requisite for a well-formulated narrative -- be it fictional or nonfictional -- for informative and effective communication, opening up the possibility of a deeper understanding of a narrative by studying its structural properties. In this paper we present a network-based framework for modeling and analyzing the structure of a narrative, which is further expanded by incorporating methods from computational linguistics to utilize the narrative text. Modeling a narrative as a dynamically unfolding system, we characterize its progression via the growth patterns of the character network, and use sentiment analysis and topic modeling to represent the actual content of the narrative in the form of interaction maps between characters with associated sentiment values and keywords. This is a network framework advanced beyond the simple occurrence-based one most often used until now, allowing one to utilize the unique characteristics of a given narrative to a high degree. Given the ubiquity and importance of narratives, such an advanced network-based representation and analysis framework may lead to a more systematic modeling and understanding of narratives for social interactions, expression of human sentiments, and communication.

* 17 pages, 10 figures 
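The "growth patterns of the character network" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the narrative is already split into ordered scenes, each listing the characters present, and it links characters by simple co-occurrence, which is the baseline the abstract says the paper goes beyond. The function name `network_growth` and the sample scenes are hypothetical and not taken from the paper.

```python
from itertools import combinations


def network_growth(scenes):
    """Track cumulative node and edge counts of a character co-occurrence network.

    `scenes` is an ordered list of scenes, each a list of character names.
    Returns one (num_characters, num_links) tuple per scene.
    """
    nodes, edges = set(), set()
    growth = []
    for scene in scenes:
        cast = sorted(set(scene))
        nodes.update(cast)
        # Link every pair of characters appearing in the same scene.
        for a, b in combinations(cast, 2):
            edges.add((a, b))
        growth.append((len(nodes), len(edges)))
    return growth


# Hypothetical three-scene narrative.
scenes = [["Ann", "Bob"], ["Bob", "Cy"], ["Ann", "Bob", "Cy"]]
growth = network_growth(scenes)
print(growth)
```

Plotting the two counts against scene index gives a simple view of how quickly a narrative introduces characters and relationships.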


COVID-19 Public Opinion and Emotion Monitoring System Based on Time Series Thermal New Word Mining

May 23, 2020
Yixian Zhang, Jieren Chen, Boyi Liu, Yifan Yang, Haocheng Li, Xinyi Zheng, Xi Chen, Tenglong Ren, Naixue Xiong

With the spread and development of new epidemics, identifying how public emotion changes over the course of an epidemic has great reference value. We designed and implemented a COVID-19 public opinion monitoring system based on time series thermal new word mining. We propose a new word structure discovery scheme based on the timing explosion of network topics and a Chinese sentiment analysis method for the COVID-19 public opinion environment. We establish a "Scrapy-Redis-Bloomfilter" distributed crawler framework to collect data. The system can judge the positive and negative emotions of the reviewer based on the comments, and can also reflect the depth of seven emotions such as Hopeful, Happy, and Depressed. Finally, we improved the sentiment discriminant model of this system and compared its sentiment discriminant error on COVID-19 related comments with that of the Jiagu deep learning model. The results show that our model has better generalization ability and smaller discriminant error. We designed a large data visualization screen, which can clearly show the trend of public emotions, the proportion of various emotion categories, keywords, hot topics, etc., and fully and intuitively reflect the development of public opinion.
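The "Bloomfilter" part of the crawler framework named in the abstract presumably serves to skip URLs that have already been fetched. The sketch below is a small in-memory Bloom filter showing that idea. It is an assumption-laden illustration, not the authors' implementation: the class name, the bit-array size, the number of hashes, and the use of SHA-256 are all hypothetical choices, and a distributed setup would keep the bit array in a shared store rather than in process memory.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, a small chance of false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical usage: remember a crawled URL and check it later.
seen = BloomFilter()
url = "https://example.com/post/1"
was_seen_before = url in seen
seen.add(url)
is_seen_now = url in seen
```

A crawler would call `add` after fetching a page and test membership before scheduling a request, accepting that a rare false positive causes a page to be skipped.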


LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer

May 18, 2021
Machel Reid, Victor Zhong

Many types of text style transfer can be achieved with only small, precise edits (e.g. sentiment transfer from I had a terrible time... to I had a great time...). We propose a coarse-to-fine editor for style transfer that transforms text using Levenshtein edit operations (e.g. insert, replace, delete). Unlike prior single-span edit methods, our method concurrently edits multiple spans in the source text. To train without parallel style text pairs (e.g. pairs of +/- sentiment statements), we propose an unsupervised data synthesis procedure. We first convert text to style-agnostic templates using style classifier attention (e.g. I had a SLOT time...), then fill in slots in these templates using fine-tuned pretrained language models. Our method outperforms existing generation and editing style transfer methods on sentiment (Yelp, Amazon) and politeness (Polite) transfer. In particular, multi-span editing achieves higher performance and more diverse output than single-span editing. Moreover, compared to previous methods on unsupervised data synthesis, our method results in higher quality parallel style pairs and improves model performance.

* ACL-IJCNLP 2021 (Findings) 
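To make "Levenshtein edit operations" concrete, here is a minimal token-level sketch that recovers a keep/replace/insert/delete script between two sentences using the standard dynamic-programming edit distance. It illustrates only the edit operations themselves, not the LEWIS model, which predicts such edits with learned components. The function name `edit_ops` is hypothetical.

```python
def edit_ops(source, target):
    """Return (op, source_token, target_token) tuples that turn source into target."""
    n, m = len(source), len(target)
    # dist[i][j] is the edit distance between source[:i] and target[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a source token
                dist[i][j - 1] + 1,         # insert a target token
                dist[i - 1][j - 1] + cost,  # keep or replace
            )
    # Walk back from the bottom-right corner to recover one optimal script.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("keep", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, target[j - 1]))
            j -= 1
    ops.reverse()
    return ops


ops = edit_ops("I had a terrible time".split(), "I had a great time".split())
changes = [op for op in ops if op[0] != "keep"]
print(changes)
```

For the sentiment example from the abstract, the only non-keep operation is a single replacement, which matches the observation that such transfers need only small, precise edits.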


Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya

Jun 19, 2020
Abrhalei Tela, Abraham Woubie, Ville Hautamaki

In recent years, transformer models have achieved great success in natural language processing (NLP) tasks. Most of the current state-of-the-art NLP results are achieved by using monolingual transformer models, where the model is pre-trained on an unlabelled text corpus in a single language and then fine-tuned on the specific downstream task. However, the cost of pre-training a new transformer model is high for most languages. In this work, we propose a cost-effective transfer learning method to adapt a strong source language model, trained on a large monolingual corpus, to a low-resource language. Using the XLNet language model, we demonstrate competitive performance with mBERT and a pre-trained target language model on the cross-lingual sentiment (CLS) dataset and on a new sentiment analysis dataset for the low-resource language Tigrinya. With only 10k examples of the given Tigrinya sentiment analysis dataset, English XLNet has achieved 78.88% F1-Score, outperforming BERT and mBERT by 10% and 7%, respectively. More interestingly, fine-tuning the (English) XLNet model on the CLS dataset has promising results compared to mBERT and even outperformed mBERT for one dataset of the Japanese language.


Detecting Sarcasm in Multimodal Social Platforms

Aug 08, 2016
Rossano Schifanella, Paloma de Juan, Joel Tetreault, Liangliang Cao

Sarcasm is a peculiar form of sentiment expression, where the surface sentiment differs from the implied sentiment. The detection of sarcasm in social media platforms has been applied in the past mainly to textual utterances where lexical indicators (such as interjections and intensifiers), linguistic markers, and contextual information (such as user profiles, or past conversations) were used to detect the sarcastic tone. However, modern social media platforms allow users to create multimodal messages where audiovisual content is integrated with the text, making the analysis of a mode in isolation partial. In our work, we first study the relationship between the textual and visual aspects in multimodal posts from three major social media platforms, i.e., Instagram, Tumblr and Twitter, and we run a crowdsourcing task to quantify the extent to which images are perceived as necessary by human annotators. Moreover, we propose two different computational frameworks to detect sarcasm that integrate the textual and visual modalities. The first approach exploits visual semantics trained on an external dataset, and concatenates the semantic features with state-of-the-art textual features. The second method adapts a visual neural network initialized with parameters trained on ImageNet to multimodal sarcastic posts. Results show the positive effect of combining modalities for the detection of sarcasm across platforms and methods.

* 10 pages, 3 figures, final version published in the Proceedings of ACM Multimedia 2016 


Blind signal decomposition of various word embeddings based on join and individual variance explained

Nov 30, 2020
Yikai Wang, Weijian Li

In recent years, natural language processing (NLP) has become one of the most important areas, with various applications in human life. As the most fundamental task, word embedding still requires more attention and research. Currently, existing work on word embeddings focuses on proposing novel embedding algorithms and dimension reduction techniques for well-trained word embeddings. In this paper, we propose to use a novel joint signal separation method, JIVE, to jointly decompose various trained word embeddings into joint and individual components. Through this decomposition framework, we can easily investigate the similarities and differences among different word embeddings. We conducted an extensive empirical study on word2vec, FastText and GloVe trained on different corpora and with different dimensions. We compared the performance of different decomposed components based on sentiment analysis on Twitter and the Stanford Sentiment Treebank. We found that by mapping different word embeddings into the joint component, sentiment performance can be greatly improved for the original word embeddings with lower performance. Moreover, we found that by concatenating different components together, the same model can achieve better performance. These findings provide great insights into word embeddings, and our work offers a new way of generating word embeddings by fusing them.

* 9 pages, 10 figures 


Fiber Bundle Morphisms as a Framework for Modeling Many-to-Many Maps

Mar 15, 2022
Elizabeth Coda, Nico Courts, Colby Wight, Loc Truong, WoongJo Choi, Charles Godfrey, Tegan Emerson, Keerti Kappagantula, Henry Kvinge

While it is not generally reflected in the 'nice' datasets used for benchmarking machine learning algorithms, the real world is full of processes that would be best described as many-to-many. That is, a single input can potentially yield many different outputs (whether due to noise, imperfect measurement, or intrinsic stochasticity in the process) and many different inputs can yield the same output (that is, the map is not injective). For example, imagine a sentiment analysis task where, due to linguistic ambiguity, a single statement can have a range of different sentiment interpretations while at the same time many distinct statements can represent the same sentiment. When modeling such a multivalued function $f: X \rightarrow Y$, it is frequently useful to be able to model the distribution on $f(x)$ for a specific input $x$ as well as the distribution on the fiber $f^{-1}(y)$ for a specific output $y$. Such an analysis helps the user (i) better understand the variance intrinsic to the process they are studying and (ii) understand the range of specific inputs $x$ that can be used to achieve output $y$. Following existing work which used a fiber bundle framework to better model many-to-one processes, we describe how morphisms of fiber bundles provide a template for building models which naturally capture the structure of many-to-many processes.
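The two distributions described in the abstract, the one on $f(x)$ for a fixed input and the one on the fiber $f^{-1}(y)$ for a fixed output, can be estimated from observed pairs by counting. The sketch below does exactly that for a toy sentiment example. It is a plain empirical estimate under the assumption of a finite sample of labelled (statement, sentiment) pairs, not the fiber bundle morphism model the paper proposes. All names and data are hypothetical.

```python
from collections import Counter, defaultdict


def empirical_maps(pairs):
    """Count outputs per input (forward) and inputs per output (fiber)."""
    forward = defaultdict(Counter)  # x -> counts of observed y
    fiber = defaultdict(Counter)    # y -> counts of observed x
    for x, y in pairs:
        forward[x][y] += 1
        fiber[y][x] += 1
    return forward, fiber


def normalize(counter):
    """Turn raw counts into an empirical probability distribution."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}


# Hypothetical annotations: one statement read two ways, two statements read alike.
pairs = [
    ("well, that was something", "positive"),
    ("well, that was something", "negative"),
    ("I loved it", "positive"),
]
forward, fiber = empirical_maps(pairs)
dist_outputs = normalize(forward["well, that was something"])
fiber_positive = set(fiber["positive"])
```

Here `dist_outputs` estimates the distribution on $f(x)$ for the ambiguous statement, and `fiber_positive` lists the observed inputs in the fiber over "positive".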


eDarkTrends: Harnessing Social Media Trends in Substance use disorders for Opioid Listings on Cryptomarket

Mar 29, 2021
Usha Lokala, Francois Lamy, Triyasha Ghosh Dastidar, Kaushik Roy, Raminta Daniulaityte, Srinivasan Parthasarathy, Amit Sheth

Opioid and substance misuse is rampant in the United States today, a phenomenon known as the opioid crisis. The relationship between substance use and mental health has been extensively studied, with one possible relationship being that substance misuse causes poor mental health. However, the lack of evidence on the relationship has resulted in opioids being largely inaccessible through legal means. This study analyzes substance misuse posts on social media together with the opioids being sold through crypto market listings. We use the Drug Abuse Ontology, state-of-the-art deep learning, and BERT-based models to generate sentiment and emotion labels for the social media posts, in order to understand user perception by investigating questions such as which synthetic opioids people are optimistic, neutral, or negative about, what kinds of drugs induce fear and sorrow, what kinds of drugs people love or are thankful for, which drugs people think negatively about, and which opioids cause little to no sentimental reaction. We also perform topic analysis associated with the generated sentiments and emotions to understand which topics correlate with people's responses to various drugs. Our findings can help shape policy to help isolate opioid use cases where timely intervention may be required to prevent adverse consequences, prevent overdose-related deaths, and keep the epidemic from worsening.

* 6 pages, ICLR AI for Public Health Workshop 2021 


Coronavirus on Social Media: Analyzing Misinformation in Twitter Conversations

Apr 21, 2020
Karishma Sharma, Sungyong Seo, Chuizheng Meng, Sirisha Rambhatla, Aastha Dua, Yan Liu

The ongoing Coronavirus Disease (COVID-19) pandemic highlights the interconnectedness of our present-day globalized world. With social distancing policies in place, virtual communication has become an important source of (mis)information. As an increasing number of people rely on social media platforms for news, identifying misinformation has emerged as a critical task in these unprecedented times. In addition to being malicious, the spread of such information poses a serious public health risk. To this end, we design a dashboard to track misinformation on Twitter, a popular social media platform for sharing news. The dashboard allows visibility into the social media discussions around Coronavirus and the quality of information shared on the platform, updated over time. We collect streaming data using the Twitter API from March 1, 2020 to date and identify false, misleading and clickbait content in the collected Tweets. We provide analysis of user accounts and misinformation spread across countries. In addition, we provide analysis of public sentiment on intervention policies such as "#socialdistancing" and "#workfromhome", and we track topics, emerging hashtags, and sentiments across countries. The dashboard maintains an evolving list of misinformation cascades, sentiments and emerging trends over time, accessible online at \url{}.
