Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Improving Few-shot Text Classification via Pretrained Language Representations

Aug 22, 2019
Ningyu Zhang, Zhanlin Sun, Shumin Deng, Jiaoyan Chen, Huajun Chen

Text classification tends to be difficult when the data is deficient or when it is required to adapt to unseen classes. In such challenging scenarios, recent studies have often used meta-learning to simulate the few-shot task, thus negating explicit common linguistic features across tasks. Deep language representations have proven to be very effective forms of unsupervised pretraining, yielding contextualized features that capture linguistic properties and benefit downstream natural language understanding tasks. However, the effect of pretrained language representation for few-shot learning on text classification tasks is still not well understood. In this study, we design a few-shot learning model with pretrained language representations and report the empirical results. We show that our approach is not only simple but also produces state-of-the-art performance on a well-studied sentiment classification dataset. It can thus be further suggested that pretraining could be a promising solution for few shot learning of many other NLP tasks. The code and the dataset to replicate the experiments are made available at

* arXiv admin note: substantial text overlap with arXiv:1902.10482, arXiv:1803.02400 by other authors 

  Access Paper or Ask Questions

Neural Image Captioning

Jul 02, 2019
Elaina Tan, Lakshay Sharma

In recent years, the biggest advances in major Computer Vision tasks, such as object recognition, handwritten-digit identification, facial recognition, and many others., have all come through the use of Convolutional Neural Networks (CNNs). Similarly, in the domain of Natural Language Processing, Recurrent Neural Networks (RNNs), and Long Short Term Memory networks (LSTMs) in particular, have been crucial to some of the biggest breakthroughs in performance for tasks such as machine translation, part-of-speech tagging, sentiment analysis, and many others. These individual advances have greatly benefited tasks even at the intersection of NLP and Computer Vision, and inspired by this success, we studied some existing neural image captioning models that have proven to work well. In this work, we study some existing captioning models that provide near state-of-the-art performances, and try to enhance one such model. We also present a simple image captioning model that makes use of a CNN, an LSTM, and the beam search1 algorithm, and study its performance based on various qualitative and quantitative metrics.

  Access Paper or Ask Questions

Unsupervised Text Style Transfer via Iterative Matching and Translation

Jan 31, 2019
Zhijing Jin, Di Jin, Jonas Mueller, Nicholas Matthews, Enrico Santus

Text style transfer seeks to learn how to automatically rewrite sentences from a source domain to the target domain in different styles, while simultaneously preserving their semantic contents. A major challenge in this task stems from the lack of parallel data that connects the source and target styles. Existing approaches try to disentangle content and style, but this is quite difficult and often results in poor content-preservation and grammaticality. In contrast, we propose a novel approach by first constructing a pseudo-parallel resource that aligns a subset of sentences with similar content between source and target corpus. And then a standard sequence-to-sequence model can be applied to learn the style transfer. Subsequently, we iteratively refine the learned style transfer function while improving upon the imperfections in our original alignment. Our method is applied to the tasks of sentiment modification and formality transfer, where it outperforms state-of-the-art systems by a large margin. As an auxiliary contribution, we produced a publicly-available test set with human-generated style transfers for future community use.

  Access Paper or Ask Questions

User Intent Prediction in Information-seeking Conversations

Jan 11, 2019
Chen Qu, Liu Yang, Bruce Croft, Yongfeng Zhang, Johanne R. Trippas, Minghui Qiu

Conversational assistants are being progressively adopted by the general population. However, they are not capable of handling complicated information-seeking tasks that involve multiple turns of information exchange. Due to the limited communication bandwidth in conversational search, it is important for conversational assistants to accurately detect and predict user intent in information-seeking conversations. In this paper, we investigate two aspects of user intent prediction in an information-seeking setting. First, we extract features based on the content, structural, and sentiment characteristics of a given utterance, and use classic machine learning methods to perform user intent prediction. We then conduct an in-depth feature importance analysis to identify key features in this prediction task. We find that structural features contribute most to the prediction performance. Given this finding, we construct neural classifiers to incorporate context information and achieve better performance without feature engineering. Our findings can provide insights into the important factors and effective methods of user intent prediction in information-seeking conversations.

* Accepted to CHIIR 2019 

  Access Paper or Ask Questions

Backpropagating through Structured Argmax using a SPIGOT

May 12, 2018
Hao Peng, Sam Thomson, Noah A. Smith

We introduce the structured projection of intermediate gradients optimization technique (SPIGOT), a new method for backpropagating through neural networks that include hard-decision structured predictions (e.g., parsing) in intermediate layers. SPIGOT requires no marginal inference, unlike structured attention networks (Kim et al., 2017) and some reinforcement learning-inspired solutions (Yogatama et al., 2017). Like so-called straight-through estimators (Hinton, 2012), SPIGOT defines gradient-like quantities associated with intermediate nondifferentiable operations, allowing backpropagation before and after them; SPIGOT's proxy aims to ensure that, after a parameter update, the intermediate structure will remain well-formed. We experiment on two structured NLP pipelines: syntactic-then-semantic dependency parsing, and semantic parsing followed by sentiment classification. We show that training with SPIGOT leads to a larger improvement on the downstream task than a modularly-trained pipeline, the straight-through estimator, and structured attention, reaching a new state of the art on semantic dependency parsing.

* ACL 2018 

  Access Paper or Ask Questions

Multi-attention Recurrent Network for Human Communication Comprehension

Feb 03, 2018
Amir Zadeh, Paul Pu Liang, Soujanya Poria, Prateek Vij, Erik Cambria, Louis-Philippe Morency

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication, however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape human communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN shows state-of-the-art performance on all the datasets.

* AAAI 2018 Oral Presentation 

  Access Paper or Ask Questions

Dataset of Philippine Presidents Speeches from 1935 to 2016

Nov 12, 2021
John Paul P. Miranda

The dataset was collected to examine and identify possible key topics within these texts. Data preparation such as data cleaning, transformation, tokenization, removal of stop words from both English and Filipino, and word stemming was employed in the dataset before feeding it to sentiment analysis and the LDA model. The topmost occurring word within the dataset is "development" and there are three (3) likely topics from the speeches of Philippine presidents: economic development, enhancement of public services, and addressing challenges. The dataset was able to provide valuable insights contained among official documents. While the study showed that presidents have used their annual address to express their visions for the country. It also presented that the presidents from 1935 to 2016 faced the same problems during their term. Future researchers may collect other speeches made by presidents during their term; combine them to the dataset used in this study to further investigate these important texts by subjecting them to the same methodology used in this study. The dataset may be requested from the authors and it is recommended for further analysis. For example, determine how the speeches of the president reflect the preamble or foundations of the Philippine constitution.

* International Journal of Computing Sciences Research 5 (2021) 1-11 
* 11 pages, 4 figures, 4 tables, dataset 

  Access Paper or Ask Questions

TLA: Twitter Linguistic Analysis

Jul 20, 2021
Tushar Sarkar, Nishant Rajadhyaksha

Linguistics has been instrumental in developing a deeper understanding of human nature. Words are indispensable to bequeath the thoughts, emotions, and purpose of any human interaction, and critically analyzing these words can elucidate the social and psychological behavior and characteristics of these social animals. Social media has become a platform for human interaction on a large scale and thus gives us scope for collecting and using that data for our study. However, this entire process of collecting, labeling, and analyzing this data iteratively makes the entire procedure cumbersome. To make this entire process easier and structured, we would like to introduce TLA(Twitter Linguistic Analysis). In this paper, we describe TLA and provide a basic understanding of the framework and discuss the process of collecting, labeling, and analyzing data from Twitter for a corpus of languages while providing detailed labeled datasets for all the languages and the models are trained on these datasets. The analysis provided by TLA will also go a long way in understanding the sentiments of different linguistic communities and come up with new and innovative solutions for their problems based on the analysis.

  Access Paper or Ask Questions

Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals

Apr 09, 2021
Xi Ye, Rohan Nair, Greg Durrett

Token-level attributions have been extensively studied to explain model predictions for a wide range of classification tasks in NLP (e.g., sentiment analysis), but such explanation techniques are less explored for machine reading comprehension (RC) tasks. Although the transformer-based models used here are identical to those used for classification, the underlying reasoning these models perform is very different and different types of explanations are required. We propose a methodology to evaluate explanations: an explanation should allow us to understand the RC model's high-level behavior with respect to a set of realistic counterfactual input scenarios. We define these counterfactuals for several RC settings, and by connecting explanation techniques' outputs to high-level model behavior, we can evaluate how useful different explanations really are. Our analysis suggests that pairwise explanation techniques are better suited to RC than token-level attributions, which are often unfaithful in the scenarios we consider. We additionally propose an improvement to an attention-based attribution technique, resulting in explanations which better reveal the model's behavior.

  Access Paper or Ask Questions

Quasi Error-free Text Classification and Authorship Recognition in a large Corpus of English Literature based on a Novel Feature Set

Oct 21, 2020
Arthur M. Jacobs, Annette Kinder

The Gutenberg Literary English Corpus (GLEC) provides a rich source of textual data for research in digital humanities, computational linguistics or neurocognitive poetics. However, so far only a small subcorpus, the Gutenberg English Poetry Corpus, has been submitted to quantitative text analyses providing predictions for scientific studies of literature. Here we show that in the entire GLEC quasi error-free text classification and authorship recognition is possible with a method using the same set of five style and five content features, computed via style and sentiment analysis, in both tasks. Our results identify two standard and two novel features (i.e., type-token ratio, frequency, sonority score, surprise) as most diagnostic in these tasks. By providing a simple tool applicable to both short poems and long novels generating quantitative predictions about features that co-determe the cognitive and affective processing of specific text categories or authors, our data pave the way for many future computational and empirical studies of literature or experiments in reading psychology.

* 18 pages, 3 tables 

  Access Paper or Ask Questions