Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Dec 08, 2020
Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha

Machine Learning has seen tremendous growth recently, which has led to a larger adoption of ML systems for educational assessments, credit risk, healthcare, employment, criminal justice, to name a few. Trustworthiness of ML and NLP systems is a crucial aspect and requires guarantee that the decisions they make are fair and robust. Aligned with this, we propose a framework GYC, to generate a set of counterfactual text samples, which are crucial for testing these ML systems. Our main contributions include a) We introduce GYC, a framework to generate counterfactual samples such that the generation is plausible, diverse, goal-oriented, and effective, b) We generate counterfactual samples, that can direct the generation towards a corresponding condition such as named-entity tag, semantic role label, or sentiment. Our experimental results on various domains show that GYC generates counterfactual text samples exhibiting the above four properties. %The generated counterfactuals can then be fed complementary to the existing data augmentation for improving the debiasing algorithms performance as compared to existing counterfactuals generated by token substitution. GYC generates counterfactuals that can act as test cases to evaluate a model and any text debiasing algorithm.

* Accepted to appear at AAAI 2021 

  Access Paper or Ask Questions

AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts

Nov 07, 2020
Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh

The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the-blanks problems (e.g., cloze tests) is a natural approach for gauging such knowledge, however, its usage is limited by the manual effort and guesswork required to write suitable prompts. To address this, we develop AutoPrompt, an automated method to create prompts for a diverse set of tasks, based on a gradient-guided search. Using AutoPrompt, we show that masked language models (MLMs) have an inherent capability to perform sentiment analysis and natural language inference without additional parameters or finetuning, sometimes achieving performance on par with recent state-of-the-art supervised models. We also show that our prompts elicit more accurate factual knowledge from MLMs than the manually created prompts on the LAMA benchmark, and that MLMs can be used as relation extractors more effectively than supervised relation extraction models. These results demonstrate that automatically generated prompts are a viable parameter-free alternative to existing probing methods, and as pretrained LMs become more sophisticated and capable, potentially a replacement for finetuning.

* v2: Fixed error in Figure 2 

  Access Paper or Ask Questions

Text Classification Using Label Names Only: A Language Model Self-Training Approach

Oct 14, 2020
Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han

Current text classification methods typically require a good number of human-labeled documents as training data, which can be costly and difficult to obtain in real applications. Humans can perform classification without seeing any labeled examples but only based on a small set of words describing the categories to be classified. In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents. We use pre-trained neural language models both as general linguistic knowledge sources for category understanding and as representation learning models for document classification. Our method (1) associates semantically related words with the label names, (2) finds category-indicative words and trains the model to predict their implied categories, and (3) generalizes the model via self-training. We show that our model achieves around 90% accuracy on four benchmark datasets including topic and sentiment classification without using any labeled documents but learning from unlabeled data supervised by at most 3 words (1 in most cases) per class as the label name.

* EMNLP 2020. (Code:

  Access Paper or Ask Questions

Weakly Supervised Attention Networks for Fine-Grained Opinion Mining and Public Health

Sep 30, 2019
Giannis Karamanolakis, Daniel Hsu, Luis Gravano

In many review classification applications, a fine-grained analysis of the reviews is desirable, because different segments (e.g., sentences) of a review may focus on different aspects of the entity in question. However, training supervised models for segment-level classification requires segment labels, which may be more difficult or expensive to obtain than review labels. In this paper, we employ Multiple Instance Learning (MIL) and use only weak supervision in the form of a single label per review. First, we show that when inappropriate MIL aggregation functions are used, then MIL-based networks are outperformed by simpler baselines. Second, we propose a new aggregation function based on the sigmoid attention mechanism and show that our proposed model outperforms the state-of-the-art models for segment-level sentiment classification (by up to 9.8% in F1). Finally, we highlight the importance of fine-grained predictions in an important public-health application: finding actionable reports of foodborne illness. We show that our model achieves 48.6% higher recall compared to previous models, thus increasing the chance of identifying previously unknown foodborne outbreaks.

* Accepted for the 5th Workshop on Noisy User-generated Text (W-NUT 2019), held in conjunction with EMNLP 2019 

  Access Paper or Ask Questions

Leap-LSTM: Enhancing Long Short-Term Memory for Text Categorization

May 28, 2019
Ting Huang, Gehui Shen, Zhi-Hong Deng

Recurrent Neural Networks (RNNs) are widely used in the field of natural language processing (NLP), ranging from text categorization to question answering and machine translation. However, RNNs generally read the whole text from beginning to end or vice versa sometimes, which makes it inefficient to process long texts. When reading a long document for a categorization task, such as topic categorization, large quantities of words are irrelevant and can be skipped. To this end, we propose Leap-LSTM, an LSTM-enhanced model which dynamically leaps between words while reading texts. At each step, we utilize several feature encoders to extract messages from preceding texts, following texts and the current word, and then determine whether to skip the current word. We evaluate Leap-LSTM on several text categorization tasks: sentiment analysis, news categorization, ontology classification and topic classification, with five benchmark data sets. The experimental results show that our model reads faster and predicts better than standard LSTM. Compared to previous models which can also skip words, our model achieves better trade-offs between performance and efficiency.

* Accepted by IJCAI 2019, 7 pages, 3 figures 

  Access Paper or Ask Questions

Exploiting Synchronized Lyrics And Vocal Features For Music Emotion Detection

Jan 15, 2019
Loreto Parisi, Simone Francia, Silvio Olivastri, Maria Stella Tavella

One of the key points in music recommendation is authoring engaging playlists according to sentiment and emotions. While previous works were mostly based on audio for music discovery and playlists generation, we take advantage of our synchronized lyrics dataset to combine text representations and music features in a novel way; we therefore introduce the Synchronized Lyrics Emotion Dataset. Unlike other approaches that randomly exploited the audio samples and the whole text, our data is split according to the temporal information provided by the synchronization between lyrics and audio. This work shows a comparison between text-based and audio-based deep learning classification models using different techniques from Natural Language Processing and Music Information Retrieval domains. From the experiments on audio we conclude that using vocals only, instead of the whole audio data improves the overall performances of the audio classifier. In the lyrics experiments we exploit the state-of-the-art word representations applied to the main Deep Learning architectures available in literature. In our benchmarks the results show how the Bilinear LSTM classifier with Attention based on fastText word embedding performs better than the CNN applied on audio.

* 8 pages, 5 figures, 9 tables 

  Access Paper or Ask Questions

#phramacovigilance - Exploring Deep Learning Techniques for Identifying Mentions of Medication Intake from Twitter

May 16, 2018
Debanjan Mahata, Jasper Friedrichs, Hitkul, Rajiv Ratn Shah

Mining social media messages for health and drug related information has received significant interest in pharmacovigilance research. Social media sites (e.g., Twitter), have been used for monitoring drug abuse, adverse reactions of drug usage and analyzing expression of sentiments related to drugs. Most of these studies are based on aggregated results from a large population rather than specific sets of individuals. In order to conduct studies at an individual level or specific cohorts, identifying posts mentioning intake of medicine by the user is necessary. Towards this objective, we train different deep neural network classification models on a publicly available annotated dataset and study their performances on identifying mentions of personal intake of medicine in tweets. We also design and train a new architecture of a stacked ensemble of shallow convolutional neural network (CNN) ensembles. We use random search for tuning the hyperparameters of the models and share the details of the values taken by the hyperparameters for the best learnt model in different deep neural network architectures. Our system produces state-of-the-art results, with a micro- averaged F-score of 0.693.

  Access Paper or Ask Questions

Connotation Frames: A Data-Driven Investigation

Aug 22, 2016
Hannah Rashkin, Sameer Singh, Yejin Choi

Through a particular choice of a predicate (e.g., "x violated y"), a writer can subtly connote a range of implied sentiments and presupposed facts about the entities x and y: (1) writer's perspective: projecting x as an "antagonist"and y as a "victim", (2) entities' perspective: y probably dislikes x, (3) effect: something bad happened to y, (4) value: y is something valuable, and (5) mental state: y is distressed by the event. We introduce connotation frames as a representation formalism to organize these rich dimensions of connotation using typed relations. First, we investigate the feasibility of obtaining connotative labels through crowdsourcing experiments. We then present models for predicting the connotation frames of verb predicates based on their distributional word representations and the interplay between different types of connotative relations. Empirical results confirm that connotation frames can be induced from various data sources that reflect how people use language and give rise to the connotative meanings. We conclude with analytical results that show the potential use of connotation frames for analyzing subtle biases in online news media.

* 11 pages, published in Proceedings of ACL 2016 

  Access Paper or Ask Questions

Multimodal sparse representation learning and applications

Mar 02, 2016
Miriam Cha, Youngjune Gwon, H. T. Kung

Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between modalities. The framework can model relationships at a higher level by forcing the shared sparse representation. In particular, we propose the use of joint dictionary learning technique for sparse coding and formulate the joint representation for concision, cross-modal representations (in case of a missing modality), and union of the cross-modal representations. Given the accelerated growth of multimodal data posted on the Web such as YouTube, Wikipedia, and Twitter, learning good multimodal features is becoming increasingly important. We show that the shared representations enabled by our framework substantially improve the classification performance under both unimodal and multimodal settings. We further show how deep architectures built on the proposed framework are effective for the case of highly nonlinear correlations between modalities. The effectiveness of our approach is demonstrated experimentally in image denoising, multimedia event detection and retrieval on the TRECVID dataset (audio-video), category classification on the Wikipedia dataset (image-text), and sentiment classification on PhotoTweet (image-text).

  Access Paper or Ask Questions