"Sentiment": models, code, and papers

Learning with Rules

May 22, 2018
Deborah Cohen, Amit Daniely, Amir Globerson, Gal Elidan

Complex classifiers may exhibit "embarrassing" failures in cases where a human can easily provide a justified classification. Avoiding such failures is obviously of key importance. In this work, we focus on one such setting, where a label is perfectly predictable if the input contains certain features, and otherwise it is predictable by a linear classifier. We define a hypothesis class that captures this notion and determine its sample complexity. We also give evidence that efficient algorithms cannot achieve this sample complexity. We then derive a simple and efficient algorithm and give evidence that its sample complexity is optimal among efficient algorithms. Experiments on synthetic and sentiment analysis data demonstrate the efficacy of the method, both in terms of accuracy and interpretability.
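
The setting described in this abstract (a rule decides the label when certain features are present, and a linear classifier decides otherwise) can be illustrated with a minimal sketch. This is not the authors' algorithm: the representation of a rule as a single feature index mapped to a label, the function name `predict_with_rules`, and the first-match tie-breaking are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def predict_with_rules(
    x: Sequence[float],
    rules: Dict[int, int],
    weights: Sequence[float],
    bias: float = 0.0,
) -> int:
    """Return a label in {-1, +1} for the feature vector x.

    rules maps a feature index to the label that is assumed to be
    certain whenever that feature is active (non-zero) in x.
    """
    # Rules take priority: the first active rule feature decides the label.
    # (First-match ordering is an illustrative assumption.)
    for index, label in rules.items():
        if x[index] != 0:
            return label

    # No rule fired, so fall back to a plain linear classifier.
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score >= 0 else -1
```

With `rules = {0: -1}` and `weights = [0.0, 1.0, -1.0]`, the input `[1, 5, 0]` is labeled by the rule, while `[0, 5, 0]` is labeled by the linear score.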

Estimating the Rating of Reviewers Based on the Text

May 22, 2018
Mohammadamir Kavousi, Sepehr Saadatmand

User-generated texts such as reviews and social media posts are valuable sources of information. Online reviews help users decide whether to buy a product, see a movie, or make some other decision. The rating of a review is therefore one of the factors users rely on when deciding which reviews to read and trust. This paper analyzes the texts of reviews to evaluate and predict their ratings. Moreover, we study the effect of lexical features generated from the text, as well as sentiment words, on the accuracy of rating prediction. Our analysis shows that words with a high information gain score are more effective than words with a high TF-IDF value. In addition, we explore the best number of features for predicting the ratings of the reviews.
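
As a rough illustration of the information gain score mentioned in this abstract, the sketch below computes how much knowing whether a word occurs in a review reduces the entropy of the rating labels. The function names, the set-of-words document representation, and the toy data are assumptions for illustration; the paper's actual feature pipeline is not described here.

```python
import math
from collections import Counter
from typing import List, Sequence, Set


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs: List[Set[str]], labels: List[int], word: str) -> float:
    """Entropy reduction from splitting the (non-empty) corpus on word presence."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    # Weighted entropy of the two partitions induced by the word.
    conditional = (
        (len(with_word) / n) * entropy(with_word)
        + (len(without_word) / n) * entropy(without_word)
    )
    return entropy(labels) - conditional


# Toy corpus: each review is a set of words, each label a star rating.
docs = [{"great", "movie"}, {"great", "plot"}, {"bad", "movie"}, {"bad", "plot"}]
labels = [5, 5, 1, 1]
gain_great = information_gain(docs, labels, "great")
gain_movie = information_gain(docs, labels, "movie")
```

In this toy corpus "great" separates the ratings perfectly and so scores higher than "movie", which appears equally often under both ratings.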

* Accepted at the First International Conference on Data Analytics & Learning 2018, paper ID: 76 

Multichannel LSTM-CNN for Telugu Technical Domain Identification

Feb 24, 2021
Sunil Gundapu, Radhika Mamidi

With the rapid growth of text information, retrieving domain-oriented information from text data has a broad range of applications in Information Retrieval and Natural Language Processing. Thematic keywords give a compressed representation of the text. Domain Identification usually plays a significant role in Machine Translation, Text Summarization, Question Answering, Information Extraction, and Sentiment Analysis. In this paper, we propose the Multichannel LSTM-CNN methodology for Technical Domain Identification for Telugu. This architecture was used and evaluated in the context of the ICON shared task TechDOfication 2020 (task h), and our system achieved an F1 score of 69.9% on the test dataset and 90.01% on the validation set.

* Paper accepted at the Seventeenth International Conference on Natural Language Processing (ICON-2020) 

PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models

Jun 16, 2020
Eyal Ben-David, Carmel Rabinovitz, Roi Reichart

Pivot-based neural representation models have led to significant progress in domain adaptation for NLP. However, previous works that follow this approach utilize only labeled data from the source domain and unlabeled data from the source and target domains, but neglect to incorporate massive unlabeled corpora that are not necessarily drawn from these domains. To alleviate this, we propose PERL: A representation learning model that extends contextualized word embedding models such as BERT with pivot-based fine-tuning. PERL outperforms strong baselines across 22 sentiment classification domain adaptation setups, improves in-domain model performance, yields effective reduced-size models and increases model stability.

* Accepted to TACL in June 2020 

A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews

May 28, 2020
Edison Marrese-Taylor, Cristian Rodriguez-Opazo, Jorge A. Balazs, Stephen Gould, Yutaka Matsuo

Despite the recent advances in opinion mining for written reviews, few works have tackled the problem on other sources of reviews. In light of this issue, we propose a multi-modal approach for mining fine-grained opinions from video reviews that is able to determine the aspects of the item under review that are being discussed and the sentiment orientation towards them. Our approach works at the sentence level without the need for time annotations and uses features derived from the audio, video and language transcriptions of its contents. We evaluate our approach on two datasets and show that leveraging the video and audio modalities consistently provides increased performance over text-only baselines, providing evidence these extra modalities are key in better understanding video reviews.

* Second Grand Challenge and Workshop on Multimodal Language ACL 2020 

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

May 22, 2018
Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic

Multimodal affective computing, learning to recognize and interpret human affects and subjective information from multiple data sources, is still challenging because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at an abstract level, ignoring time-dependent interactions between modalities. Addressing such issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion to classify utterance-level sentiment and emotion from text and audio data. Our model outperforms the state-of-the-art approaches on published datasets, and we demonstrate that it is able to visualize and interpret the synchronized attention over modalities.

* Accepted by ACL 2018 

What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training

May 16, 2018
Yitong Li, Timothy Baldwin, Trevor Cohn

Most real-world language problems require learning from heterogeneous corpora, raising the problem of learning robust models which generalise well to both similar (in domain) and dissimilar (out of domain) instances to those seen in training. This requires learning an underlying task, while not learning irrelevant signals and biases specific to individual domains. We propose a novel method to optimise both in- and out-of-domain accuracy based on joint learning of a structured neural model with domain-specific and domain-general components, coupled with adversarial training for domain. Evaluating on multi-domain language identification and multi-domain sentiment analysis, we show substantial improvements over standard domain adaptation techniques, and domain-adversarial training.

* Accepted to NAACL 2018 

Mapping Unseen Words to Task-Trained Embedding Spaces

Jun 23, 2016
Pranava Swaroop Madhyastha, Mohit Bansal, Kevin Gimpel, Karen Livescu

We consider the supervised training setting in which we learn task-specific word embeddings. We assume that we start with initial embeddings learned from unlabelled data and update them to learn task-specific embeddings for words in the supervised training data. However, for new words in the test set, we must use either their initial embeddings or a single unknown embedding, which often leads to errors. We address this by learning a neural network to map from initial embeddings to the task-specific embedding space, via a multi-loss objective function. The technique is general, but here we demonstrate its use for improved dependency parsing (especially for sentences with out-of-vocabulary words), as well as for downstream improvements on sentiment analysis.

* 8 + 3 pages, 3 figures 

Unified Framework for Quantification

Jun 02, 2016
Aykut Firat

Quantification is the machine learning task of estimating test-data class proportions that are not necessarily similar to those in training. Apart from its intrinsic value as an aggregate statistic, quantification output can also be used to optimize classifier probabilities, thereby increasing classification accuracy. We unify major quantification approaches under a constrained multi-variate regression framework, and use mathematical programming to estimate class proportions for different loss functions. With this modeling approach, we extend existing binary-only quantification approaches to multi-class settings as well. We empirically verify our unified framework by experimenting with several multi-class datasets including the Stanford Sentiment Treebank and CIFAR-10.
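
To make the quantification task concrete, here is a minimal sketch of the adjusted classify-and-count estimator, a standard binary baseline from the quantification literature. It is not the unified regression framework this abstract proposes; the function name and the example rates are assumptions for illustration, and the true and false positive rates are assumed to have been estimated on held-out labeled data.

```python
from typing import Sequence


def adjusted_count(predictions: Sequence[int], tpr: float, fpr: float) -> float:
    """Estimate the positive-class proportion in an unlabeled test set.

    predictions: hard 0/1 classifier outputs on the test set.
    tpr, fpr: the classifier's true and false positive rates,
              assumed to be measured on held-out labeled data.
    """
    if not predictions:
        raise ValueError("predictions must be non-empty")
    if tpr <= fpr:
        raise ValueError("tpr must exceed fpr for the correction to be defined")

    # Naive classify-and-count: fraction of items predicted positive.
    observed = sum(predictions) / len(predictions)

    # Solve observed = p * tpr + (1 - p) * fpr for the true proportion p.
    estimate = (observed - fpr) / (tpr - fpr)

    # Clip to a valid proportion, since sampling noise can push it outside [0, 1].
    return min(1.0, max(0.0, estimate))


# 40% of items are predicted positive by a classifier with tpr=0.8, fpr=0.2.
estimated_share = adjusted_count([1] * 40 + [0] * 60, tpr=0.8, fpr=0.2)
```

The correction matters when the test-set class balance differs from training: here the naive count of 0.4 is adjusted down to roughly one third.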

* 9 pages, 4 figures 

Investigating the Impact of COVID-19 on Education by Social Network Mining

Mar 13, 2022
Mohadese Jamalian, Hamed Vahdat-Nejad, Hamideh Hajiabadi

The Covid-19 virus has been one of the most discussed topics on social networks in 2020 and 2021 and has affected the classic educational paradigm worldwide. In this research, many tweets related to the Covid-19 virus and education are considered and geo-tagged with the help of the GeoNames geographic database, which contains a large number of place names. To detect the sentiment of users, sentiment analysis is performed using a RoBERTa-based language model. Finally, we obtain the frequency trends of total, positive, and negative tweets for countries with a high number of confirmed Covid-19 cases. Investigating the results reveals a correlation between the trends of tweet frequency and the official statistics of confirmed cases for several countries.

* 6 pages, 1 figure 
