
"Sentiment": models, code, and papers

Re-Assessing the "Classify and Count" Quantification Method

Nov 04, 2020
Alejandro Moreo, Fabrizio Sebastiani

Learning to quantify (a.k.a. quantification) is a task concerned with training unbiased estimators of class prevalence via supervised learning. This task originated with the observation that "Classify and Count" (CC), the trivial method of obtaining class prevalence estimates, is often a biased estimator, and thus delivers suboptimal quantification accuracy; following this observation, several methods for learning to quantify have been proposed that have been shown to outperform CC. In this work we contend that previous works have failed to use properly optimised versions of CC. We thus reassess the real merits of CC (and its variants), and argue that, while still inferior to some cutting-edge methods, they deliver near-state-of-the-art accuracy once (a) hyperparameter optimisation is performed, and (b) this optimisation is performed by using a true quantification loss instead of a standard classification-based loss. Experiments on three publicly available binary sentiment classification datasets support these conclusions.
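As a rough sketch of the methods being re-assessed, plain CC simply counts positive predictions, while the "adjusted" variant (ACC) corrects the count using the classifier's error rates. Function names here are ours, not the paper's, and this is a minimal illustration, not the authors' implementation:

```python
def classify_and_count(predictions):
    """Classify and Count (CC): estimate positive-class prevalence as
    the fraction of items the classifier labels positive (1)."""
    return sum(predictions) / len(predictions)

def adjusted_classify_and_count(predictions, tpr, fpr):
    """ACC variant: correct the CC estimate using the classifier's
    true/false positive rates, estimated on held-out data.
    Inverts cc = tpr * p + fpr * (1 - p), clipping to [0, 1]."""
    cc = classify_and_count(predictions)
    p = (cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))
```

For instance, four predictions `[1, 0, 1, 1]` give a CC estimate of 0.75; with an estimated TPR of 0.9 and FPR of 0.1, ACC adjusts this to (0.75 - 0.1) / 0.8 = 0.8125.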


Quantal synaptic dilution enhances sparse encoding and dropout regularisation in deep networks

Sep 28, 2020
Gardave S Bhumbra

Dropout is a technique that silences the activity of units stochastically while training deep networks to reduce overfitting. Here we introduce Quantal Synaptic Dilution (QSD), a biologically plausible model of dropout regularisation based on the quantal properties of neuronal synapses, that incorporates heterogeneities in response magnitudes and release probabilities for vesicular quanta. QSD outperforms standard dropout in ReLU multilayer perceptrons, with enhanced sparse encoding at test time when dropout masks are replaced with identity functions, without shifts in trainable weight or bias distributions. For convolutional networks, the method also improves generalisation in computer vision tasks with and without inclusion of additional forms of regularisation. QSD also outperforms standard dropout in recurrent networks for language modelling and sentiment analysis. An advantage of QSD over many variations of dropout is that it can be implemented generally in all conventional deep networks where standard dropout is applicable.
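For reference, standard (inverted) dropout can be sketched as below, with a per-unit keep probability as a nod to the heterogeneous release probabilities QSD introduces; the actual quantal machinery of QSD is more elaborate, and this function name is ours:

```python
import numpy as np

def heterogeneous_dropout(x, keep_prob, rng, train=True):
    """Inverted dropout with a scalar or per-unit keep probability.
    At test time the mask is replaced by the identity, as described
    in the abstract above."""
    if not train:
        return x
    keep_prob = np.broadcast_to(np.asarray(keep_prob, dtype=float), x.shape)
    mask = rng.random(x.shape) < keep_prob  # keep each unit independently
    # Rescale surviving activations so the expected value is unchanged
    return np.where(mask, x / keep_prob, 0.0)
```

With `keep_prob=1.0` every unit survives and the input passes through unchanged; lower values silence units stochastically while preserving the expected activation.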

* 23 pages, 8 figures, including Appendix 


Leveraging Affective Bidirectional Transformers for Offensive Language Detection

May 16, 2020
AbdelRahim Elmadany, Chiyu Zhang, Muhammad Abdul-Mageed, Azadeh Hashemi

Social media are pervasive in our lives, making it necessary to ensure safe online experiences by detecting and removing offensive and hate speech. In this work, we report our submission to the Offensive Language and hate-speech Detection shared task organized with the 4th Workshop on Open-Source Arabic Corpora and Processing Tools Arabic (OSACT4). We focus on developing purely deep learning systems, without a need for feature engineering. For that purpose, we develop an effective method for automatic data augmentation and show the utility of training both offensive and hate speech models by fine-tuning previously trained affective models (i.e., sentiment and emotion models). Our best models are significantly better than a vanilla BERT model, with 89.60% acc (82.31% macro F1) for hate speech and 95.20% acc (70.51% macro F1) on official TEST data.


XLNet: Generalized Autoregressive Pretraining for Language Understanding

Jun 19, 2019
Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.
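The core idea of maximizing likelihood over all factorization orders can be illustrated with a toy sketch: sample one permutation of positions and record, for each target position, which positions are visible as context under that order. This is only a schematic of the objective (function names are ours), not the two-stream attention implementation XLNet actually uses:

```python
import random

def permutation_lm_targets(tokens, seed=None):
    """Sample one factorization order over token positions and, for
    each target position, list the context positions visible under
    that order (a toy view of the permutation-LM objective)."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)
    pairs = []
    for i, pos in enumerate(order):
        context = sorted(order[:i])  # positions preceding pos in the order
        pairs.append((pos, context))
    return order, pairs
```

Averaged over many sampled orders, every position is predicted from every possible subset of the others, which is how the model learns bidirectional context while remaining autoregressive.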

* Pretrained models and code are available at 


Can We Derive Explicit and Implicit Bias from Corpus?

May 31, 2019
Bo Wang, Baixiang Xue, Anthony G. Greenwald

Language is a popular resource for mining speakers' attitude bias, under the assumption that speakers' statements represent their bias on concepts. However, psychology studies show that people's explicit bias in statements can differ from their implicit bias in mind. Although both explicit and implicit bias are useful for different applications, current automatic techniques do not distinguish them. Inspired by psychological measurements of explicit and implicit bias, we develop an automatic language-based technique to reproduce psychological measurements on a large population. By connecting each psychological measurement with the statements containing a certain combination of special words, we derive explicit and implicit bias by understanding the sentiment of the corresponding category of statements. Extensive experiments on English and Chinese serious media (Wikipedia) and non-serious media (social media) show that our method successfully reproduces the small-scale psychological observations on a large population and yields new findings.


An Unsupervised Approach for Aspect Category Detection Using Soft Cosine Similarity Measure

Dec 08, 2018
Erfan Ghadery, Sajad Movahedi, Heshaam Faili, Azadeh Shakery

Aspect category detection is one of the important and challenging subtasks of aspect-based sentiment analysis. Given a set of pre-defined categories, this task aims to detect categories which are indicated implicitly or explicitly in a given review sentence. Supervised machine learning approaches perform well on this subtask. Note that the performance of these methods depends on the availability of labeled training data, which is often difficult and costly to obtain. Besides, most of these supervised methods require feature engineering to perform well. In this paper, we propose an unsupervised method to address the aspect category detection task without the need for any feature engineering. Our method utilizes clusters of unlabeled reviews and the soft cosine similarity measure to accomplish aspect category detection. Experimental results on the SemEval-2014 restaurant dataset show that the proposed unsupervised approach outperforms several baselines by a substantial margin.
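The soft cosine measure generalizes cosine similarity by weighting feature pairs with a word-to-word similarity matrix, so two sentences can match even when they share no exact words. A minimal sketch (the paper's clustering pipeline around it is not shown, and the function name is ours):

```python
import numpy as np

def soft_cosine(a, b, s):
    """Soft cosine similarity between bag-of-words vectors a and b,
    given a word-to-word similarity matrix s. With s = I this
    reduces to ordinary cosine similarity."""
    num = a @ s @ b
    den = np.sqrt((a @ s @ a) * (b @ s @ b))
    return num / den
```

For example, one-hot vectors for two different words score 0 under ordinary cosine, but score 0.5 under soft cosine if the similarity matrix assigns 0.5 to that word pair.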


Language Style Transfer from Sentences with Arbitrary Unknown Styles

Aug 13, 2018
Yanpeng Zhao, Wei Bi, Deng Cai, Xiaojiang Liu, Kewei Tu, Shuming Shi

Language style transfer is the problem of migrating the content of a source sentence to a target style. In many of its applications, parallel training data are not available and source sentences to be transferred may have arbitrary and unknown styles. First, each sentence is encoded into its content and style latent representations. Then, by recombining the content with the target style, we decode a sentence aligned in the target domain. To adequately constrain the encoding and decoding functions, we couple them with two loss functions. The first is a style discrepancy loss, enforcing that the style representation accurately encodes the style information guided by the discrepancy between the sentence style and the target style. The second is a cycle consistency loss, which ensures that the transferred sentence should preserve the content of the original sentence disentangled from its style. We validate the effectiveness of our model in three tasks: sentiment modification of restaurant reviews, dialog response revision with a romantic style, and sentence rewriting with a Shakespearean style.
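The two constraints can be sketched as simple penalties on latent vectors. In the real model these operate through learned encoders and decoders; the function names and L2 forms below are our illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def cycle_consistency_loss(content, content_roundtrip):
    """Penalise content drift: encoding a sentence, transferring it to
    the target style, and transferring it back should recover the
    original content representation."""
    return float(np.mean((content - content_roundtrip) ** 2))

def style_discrepancy_loss(style_vec, target_style):
    """Penalty that grows with the distance between a sentence's
    inferred style representation and the target style."""
    return float(np.sum((style_vec - target_style) ** 2))
```

Both losses vanish exactly when the round-trip reproduces the content and the inferred style matches the target, which is the behaviour the training objective pushes toward.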


Causal Analysis of Generic Time Series Data Applied for Market Prediction

Apr 22, 2022
Anton Kolonin, Ali Raheman, Mukul Vishwas, Ikram Ansari, Juan Pinzon, Alice Ho

We explore the applicability of causal analysis based on temporally shifted (lagged) Pearson correlation, applied to diverse time series of different natures, in the context of financial market prediction. A theoretical discussion is followed by a description of the practical approach for the specific environment of time series data of diverse nature and sparsity found in financial markets. The data involve various financial metrics computable from raw market data, such as real-time trades and snapshots of the limit order book, as well as metrics derived from social media news streams, such as sentiment and different cognitive distortions. The approach is backed by a presentation of an algorithmic framework for data acquisition and analysis, followed by experimental results and a summary pointing out the possibility of discriminating causal connections between different sorts of real field market data, with further discussion of open issues and possible directions for future work.
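The core primitive, Pearson correlation at a temporal lag, can be sketched in a few lines (a minimal illustration; the paper's framework adds data acquisition, sparsity handling, and multi-series analysis on top):

```python
import numpy as np

def lagged_pearson(x, y, max_lag):
    """Pearson correlation between x and a copy of y shifted forward
    by each lag in 1..max_lag. A strong correlation at a positive lag
    suggests x carries predictive (causal-style) signal about future y."""
    out = {}
    for lag in range(1, max_lag + 1):
        xs, ys = x[:-lag], y[lag:]  # align x[t] with y[t + lag]
        out[lag] = np.corrcoef(xs, ys)[0, 1]
    return out
```

If `y` is simply `x` delayed by one step, the lag-1 correlation is exactly 1.0, which is the kind of lead-lag structure the causal analysis looks for between, e.g., sentiment metrics and price series.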

* 10 pages, 4 figures, submitted to Artificial General Intelligence 2022 conference 


Predicting Above-Sentence Discourse Structure using Distant Supervision from Topic Segmentation

Dec 12, 2021
Patrick Huber, Linzi Xing, Giuseppe Carenini

RST-style discourse parsing plays a vital role in many NLP tasks, revealing the underlying semantic/pragmatic structure of potentially complex and diverse documents. Despite its importance, one of the most prevailing limitations in modern-day discourse parsing is the lack of large-scale datasets. To overcome the data sparsity issue, distantly supervised approaches from tasks like sentiment analysis and summarization have recently been proposed. Here, we extend this line of research by exploiting distant supervision from topic segmentation, which can arguably provide a strong and oftentimes complementary signal for high-level discourse structures. Experiments on two human-annotated discourse treebanks confirm that our proposal generates accurate tree structures at the sentence and paragraph levels, consistently outperforming previous distantly supervised models on the sentence-to-document task and occasionally reaching even higher scores at the sentence-to-paragraph level.

* AAAI 2022 


Paradigm Shift in Natural Language Processing

Sep 26, 2021
Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang

In the era of deep learning, modeling for most NLP tasks has converged to several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS tagging, NER, and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have observed a rising trend of paradigm shift, in which one NLP task is solved by reformulating it as another. Paradigm shift has achieved great success on many tasks, becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks. In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
