
"Sentiment": models, code, and papers

Bi-ISCA: Bidirectional Inter-Sentence Contextual Attention Mechanism for Detecting Sarcasm in User Generated Noisy Short Text

Nov 23, 2020
Prakamya Mishra, Saroj Kaushik, Kuntal Dey

Many online comments on social media platforms are hateful, humorous, or sarcastic. The sarcastic nature of these comments (especially the short ones) alters their actual implied sentiment, which leads to misinterpretation by existing sentiment analysis models. A lot of research has been done on detecting sarcasm in text using user-based, topical, and conversational information, but little work has exploited inter-sentence contextual information for the same purpose. This paper proposes a new deep learning architecture that uses a novel Bidirectional Inter-Sentence Contextual Attention mechanism (Bi-ISCA) to capture inter-sentence dependencies for detecting sarcasm in user-generated short text using only the conversational context. The proposed model demonstrates the capability to capture explicit, implicit, and contextually incongruous words and phrases responsible for invoking sarcasm. Bi-ISCA achieves state-of-the-art results on two widely used benchmark datasets for the sarcasm detection task (Reddit and Twitter). To the best of our knowledge, none of the existing state-of-the-art models uses an inter-sentence contextual attention mechanism to detect sarcasm in user-generated short text using only conversational context.
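
As a rough illustration of inter-sentence attention between a comment and its conversational context, the PyTorch sketch below attends in both directions and feeds the pooled summaries to a sarcasm classifier. It is a minimal sketch under assumed tensor shapes, not the authors' Bi-ISCA implementation.

# Minimal sketch of a bidirectional inter-sentence attention layer (not the
# authors' Bi-ISCA code): the comment attends over its conversational context
# and vice versa, and the pooled summaries feed a sarcasm classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterSentenceAttention(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.encoder = nn.LSTM(hidden_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(4 * hidden_dim, 2)  # sarcastic / not sarcastic

    def attend(self, query, context):
        # Scaled dot-product attention from every query token to every context token.
        scores = torch.bmm(query, context.transpose(1, 2)) / query.size(-1) ** 0.5
        weights = F.softmax(scores, dim=-1)
        return torch.bmm(weights, context)            # (batch, q_len, 2 * hidden_dim)

    def forward(self, comment_emb, context_emb):
        # comment_emb, context_emb: (batch, seq_len, hidden_dim) word embeddings.
        c, _ = self.encoder(comment_emb)              # contextualized comment tokens
        p, _ = self.encoder(context_emb)              # contextualized context tokens
        c2p = self.attend(c, p).mean(dim=1)           # comment attending to context
        p2c = self.attend(p, c).mean(dim=1)           # context attending to comment
        return self.classifier(torch.cat([c2p, p2c], dim=-1))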



Plug and Play Language Models: A Simple Approach to Controlled Text Generation

Jan 08, 2020
Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, Rosanne Liu

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.
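
The steering mechanism can be illustrated with a short sketch: at each decoding step, a few gradient updates from a small attribute classifier shift the LM's hidden state before the next token is sampled. This is a minimal illustration assuming a hypothetical LM that exposes its hidden state and output head; it is not the released PPLM code.

# Sketch of PPLM-style steering: gradients from an attribute classifier nudge
# the hidden state toward the desired attribute before sampling the next token.
import torch
import torch.nn.functional as F

def steered_step(lm_hidden, lm_head, attribute_clf, target_class,
                 step_size=0.02, n_updates=3):
    """lm_hidden:      (batch, hidden_dim) current hidden state of a hypothetical LM.
    lm_head:        callable mapping a hidden state to vocabulary logits.
    attribute_clf:  small classifier over the hidden state (the 'plug-in' model)."""
    delta = torch.zeros_like(lm_hidden, requires_grad=True)
    for _ in range(n_updates):
        logits = attribute_clf(lm_hidden + delta)
        target = torch.full((lm_hidden.size(0),), target_class, dtype=torch.long)
        loss = F.cross_entropy(logits, target)        # distance from the desired attribute
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta - step_size * grad).detach().requires_grad_(True)
    # Sample the next token from the shifted hidden state.
    next_token_logits = lm_head(lm_hidden + delta.detach())
    return torch.distributions.Categorical(logits=next_token_logits).sample()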



A Multi-task Learning Model for Chinese-oriented Aspect Polarity Classification and Aspect Term Extraction

Dec 17, 2019
Heng Yang, Biqing Zeng, JianHao Yang, Youwei Song, Ruyang Xu

Aspect-based sentiment analysis (ABSA) is a multi-grained natural language processing task that consists of two subtasks: aspect term extraction (ATE) and aspect polarity classification (APC). Most existing work focuses on the aspect polarity inference subtask and ignores the significance of aspect term extraction. Moreover, existing research pays little attention to the Chinese-oriented ABSA task. Based on the local context focus (LCF) mechanism, this paper first proposes a multi-task learning model for Chinese-oriented aspect-based sentiment analysis, namely LCF-ATEPC. Compared with existing models, this model can extract aspect terms and infer aspect polarity synchronously; it is also effective for analyzing both Chinese and English comments, and an experiment on a multilingual mixed dataset validates this capability. By integrating a domain-adapted BERT model, LCF-ATEPC achieves state-of-the-art performance on aspect term extraction and aspect polarity classification across four Chinese review datasets. In addition, its results on the widely used SemEval-2014 Task 4 Restaurant and Laptop datasets surpass the previous state of the art on the ATE subtask.

* Submitted to Elsevier 
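
One generic way to arrange the two subtasks as a multi-task model on top of BERT is sketched below: a token-level head for aspect term extraction and a pooled head for polarity classification. This is not the published LCF-ATEPC code, and the local-context-focus weighting described in the abstract is omitted.

# Generic multi-task ABSA arrangement (not the published LCF-ATEPC code):
# one head tags aspect terms token by token, the other classifies polarity.
import torch.nn as nn
from transformers import BertModel

class MultiTaskABSA(nn.Module):
    def __init__(self, model_name="bert-base-chinese",
                 num_bio_tags=3, num_polarities=3):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.ate_head = nn.Linear(hidden, num_bio_tags)    # B / I / O per token
        self.apc_head = nn.Linear(hidden, num_polarities)  # negative / neutral / positive

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        tag_logits = self.ate_head(out.last_hidden_state)   # aspect term extraction
        polarity_logits = self.apc_head(out.pooler_output)  # aspect polarity classification
        return tag_logits, polarity_logits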



Effective Quantization Approaches for Recurrent Neural Networks

Feb 07, 2018
Md Zahangir Alom, Adam T Moody, Naoya Maruyama, Brian C Van Essen, Tarek M. Taha

Deep learning approaches, and Recurrent Neural Networks (RNNs) in particular, have shown superior accuracy in a large variety of tasks, including machine translation, language understanding, and movie frame generation. However, these approaches are very expensive in terms of computation. In most cases, Graphics Processing Units (GPUs) are used for large-scale implementations. Meanwhile, energy-efficient RNN approaches have been proposed for deploying solutions on special-purpose hardware, including Field-Programmable Gate Arrays (FPGAs) and mobile platforms. In this paper, we propose an effective quantization approach for RNN techniques including Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Convolutional Long Short-Term Memory (ConvLSTM). We implement different quantization methods including Binary Connect {-1, 1}, Ternary Connect {-1, 0, 1}, and Quaternary Connect {-1, -0.5, 0.5, 1}. These approaches are evaluated on sentiment analysis with the IMDB dataset and on video frame prediction with the Moving MNIST dataset. The experimental results are compared against the full-precision versions of LSTM, GRU, and ConvLSTM, and show promising results for both sentiment analysis and video frame prediction.

* 8 pages, 23 figures. Submitted to International Joint Conference on Neural Networks (IJCNN) 2018 
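
For reference, one plausible deterministic form of the three quantization schemes named in the abstract is sketched below; the paper's actual training procedure (e.g. how gradients flow through the quantizer) is not reproduced, and the thresholds are illustrative assumptions.

# Deterministic weight quantizers mapping onto the three value sets named above.
import torch

def binary_connect(w):
    # {-1, 1}: sign of each weight.
    return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

def ternary_connect(w, threshold=0.5):
    # {-1, 0, 1}: weights with small magnitude snap to zero.
    q = torch.zeros_like(w)
    q[w > threshold] = 1.0
    q[w < -threshold] = -1.0
    return q

def quaternary_connect(w, threshold=0.5):
    # {-1, -0.5, 0.5, 1}: two magnitude levels per sign.
    magnitude = torch.where(w.abs() > threshold,
                            torch.ones_like(w), 0.5 * torch.ones_like(w))
    return magnitude * torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))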


A Python Package to Detect Anti-Vaccine Users on Twitter

Oct 21, 2021
Matheus Schmitz, Goran Murić, Keith Burghardt

Vaccine hesitancy has a long history but has recently been driven by anti-vaccine narratives shared online, which significantly degrade the efficacy of vaccination strategies, such as those for COVID-19. Despite broad agreement in the medical community about the safety and efficacy of available vaccines, a large number of social media users continue to be inundated with false information about vaccines and, partly because of this, have become indecisive or unwilling to be vaccinated. The goal of this study is to better understand anti-vaccine sentiment, and to work to reduce its impact, by developing a system capable of automatically identifying the users responsible for spreading anti-vaccine narratives. We introduce a publicly available Python package capable of analyzing Twitter profiles to assess how likely a profile is to spread anti-vaccine sentiment in the future. The software package is built using text embedding methods, neural networks, and automated dataset generation, and is trained on over one hundred thousand accounts and several million tweets. This model will help researchers and policy-makers understand anti-vaccine discussion and misinformation strategies, which can further help tailor targeted campaigns that seek to inform and debunk the harmful anti-vaccination myths currently being spread. Additionally, we leverage the data on such users to understand the moral and emotional characteristics of anti-vaccine spreaders.
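
A generic sketch of the kind of pipeline described, embedding an account's tweets and scoring the pooled profile vector with a small neural classifier, is shown below. It does not reproduce the released package's actual API; the class and parameter names are hypothetical.

# Generic profile-scoring sketch (not the released package's API): tweet
# embeddings from any text encoder are pooled per account and scored.
import torch
import torch.nn as nn

class ProfileScorer(nn.Module):
    def __init__(self, embed_dim=768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid(),   # probability of spreading anti-vaccine content
        )

    def forward(self, tweet_embeddings):
        # tweet_embeddings: (num_tweets, embed_dim) vectors for one account.
        profile_vector = tweet_embeddings.mean(dim=0)   # pool the account's tweets
        return self.net(profile_vector)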



Representation Stability as a Regularizer for Improved Text Analytics Transfer Learning

Apr 12, 2017
Matthew Riemer, Elham Khabiri, Richard Goodwin

Although neural networks are well suited for sequential transfer learning tasks, the catastrophic forgetting problem hinders proper integration of prior knowledge. In this work, we propose a solution to this problem that uses a multi-task objective based on the idea of distillation and a mechanism that directly penalizes forgetting at the shared representation layer during the knowledge integration phase of training. We demonstrate our approach on a Twitter-domain sentiment analysis task with sequential knowledge transfer from four related tasks. We show that our technique outperforms networks fine-tuned to the target task. Additionally, we show through both empirical evidence and examples that it retains useful knowledge from the source task that is otherwise forgotten during standard fine-tuning. Surprisingly, we find that first distilling a human-made, rule-based sentiment engine into a recurrent neural network and then integrating that knowledge with the target task data leads to a substantial gain in generalization performance. Our experiments demonstrate the power of multi-source transfer techniques in practical text analytics problems when paired with distillation. In particular, on the SemEval 2016 Task 4 Subtask A dataset (Nakov et al., 2016) we surpass the state of the art established during the competition with a comparatively simple model architecture that is not even competitive when trained only on the labeled task-specific data.
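
The two ingredients of the objective, a distillation term toward the source model's soft predictions and a penalty on drift of the shared representation layer, can be sketched as a single loss function as below; the weighting coefficients and temperature are illustrative assumptions, not the paper's settings.

# Sketch of a combined objective: target-task loss + distillation toward the
# source model's soft predictions + a stability penalty on the shared layer.
import torch
import torch.nn.functional as F

def transfer_loss(target_logits, target_labels,
                  student_soft_logits, teacher_soft_logits,
                  shared_repr, frozen_repr,
                  distill_weight=1.0, stability_weight=1.0, temperature=2.0):
    task = F.cross_entropy(target_logits, target_labels)
    distill = F.kl_div(
        F.log_softmax(student_soft_logits / temperature, dim=-1),
        F.softmax(teacher_soft_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Penalize drift of the shared representation away from its pre-transfer state.
    stability = F.mse_loss(shared_repr, frozen_repr.detach())
    return task + distill_weight * distill + stability_weight * stability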


