Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Evidence of distrust and disorientation towards immunization on online social media after contrasting political communication on vaccines. Results from an analysis of Twitter data in Italy

Feb 10, 2020
Samantha Ajovalasit, Veronica Dorgali, Angelo Mazza, Alberto d' Onofrio, Piero Manfredi

Background. Recently, In Italy the vaccination coverage for key immunizations, as MMR, has been declining, with measles outbreaks. In 2017, the Italian Government expanded the number of mandatory immunizations establishing penalties for families of unvaccinated children. During the 2018 elections campaign, immunization policy entered the political debate, with the government accusing oppositions of fuelling vaccine scepticism. A new government established in 2018 temporarily relaxed penalties and announced the introduction of flexibility. Objectives and Methods. By a sentiment analysis on tweets posted in Italian during 2018, we aimed at (i) characterising the temporal flow of communication on vaccines, (ii) evaluating the usefulness of Twitter data for estimating vaccination parameters, and (iii) investigating whether the ambiguous political communication might have originated disorientation among the public. Results. The population appeared to be mostly composed by "serial twitterers" tweeting about everything including vaccines. Tweets favourable to vaccination accounted for 75% of retained tweets, undecided for 14% and unfavourable for 11%. Twitter activity of the Italian public health institutions was negligible. After smoothing the temporal pattern, an up-and-down trend in the favourable proportion emerged, synchronized with the switch between governments, providing clear evidence of disorientation. Conclusion. The reported evidence of disorientation documents that critical health topics, as immunization, should never be used for political consensus. This is especially true given the increasing role of online social media as information source, which might yield to social pressures eventually harmful for vaccine uptake, and is worsened by the lack of institutional presence on Twitter. This calls for efforts to contrast misinformation and the ensuing spread of hesitancy.

* 15 pages, 3 of appendix, 3 figures, 2 tables 

  Access Paper or Ask Questions

Improving Dataset Distillation

Oct 06, 2019
Ilia Sucholutsky, Matthias Schonlau

Dataset distillation is a method for reducing dataset sizes: the goal is to learn a small number of synthetic samples containing all the information of a large dataset. This has several benefits: speeding up model training in deep learning, reducing energy consumption, and reducing required storage space. Currently, each synthetic sample is assigned a single `hard' label, which limits the accuracies that models trained on distilled datasets can achieve. Also, currently dataset distillation can only be used with image data. We propose to simultaneously distill both images and their labels, and thus to assign each synthetic sample a `soft' label (a distribution of labels) rather than a single `hard' label. Our improved algorithm increases accuracy by 2-4% over the original dataset distillation algorithm for several image classification tasks. For example, training a LeNet model with just 10 distilled images (one per class) results in over 96% accuracy on the MNIST data. Using `soft' labels also enables distilled datasets to consist of fewer samples than there are classes as each sample can encode information for more than one class. For example, we show that LeNet achieves almost 92% accuracy on MNIST after being trained on just 5 distilled images. We also propose an extension of the dataset distillation algorithm that allows it to distill sequential datasets including texts. We demonstrate that text distillation outperforms other methods across multiple datasets. For example, we are able to train models to almost their original accuracy on the IMDB sentiment analysis task using just 20 distilled sentences.

  Access Paper or Ask Questions

Interpretable and Steerable Sequence Learning via Prototypes

Jul 23, 2019
Yao Ming, Panpan Xu, Huamin Qu, Liu Ren

One of the major challenges in machine learning nowadays is to provide predictions with not only high accuracy but also user-friendly explanations. Although in recent years we have witnessed increasingly popular use of deep neural networks for sequence modeling, it is still challenging to explain the rationales behind the model outputs, which is essential for building trust and supporting the domain experts to validate, critique and refine the model. We propose ProSeNet, an interpretable and steerable deep sequence model with natural explanations derived from case-based reasoning. The prediction is obtained by comparing the inputs to a few prototypes, which are exemplar cases in the problem domain. For better interpretability, we define several criteria for constructing the prototypes, including simplicity, diversity, and sparsity and propose the learning objective and the optimization procedure. ProSeNet also provides a user-friendly approach to model steering: domain experts without any knowledge on the underlying model or parameters can easily incorporate their intuition and experience by manually refining the prototypes. We conduct experiments on a wide range of real-world applications, including predictive diagnostics for automobiles, ECG, and protein sequence classification and sentiment analysis on texts. The result shows that ProSeNet can achieve accuracy on par with state-of-the-art deep learning models. We also evaluate the interpretability of the results with concrete case studies. Finally, through user study on Amazon Mechanical Turk (MTurk), we demonstrate that the model selects high-quality prototypes which align well with human knowledge and can be interactively refined for better interpretability without loss of performance.

* Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019 
* Accepted as a full paper at KDD 2019 on May 8, 2019 

  Access Paper or Ask Questions

Morphological Skip-Gram: Using morphological knowledge to improve word representation

Jul 21, 2020
Flávio Santos, Hendrik Macedo, Thiago Bispo, Cleber Zanchettin

Natural language processing models have attracted much interest in the deep learning community. This branch of study is composed of some applications such as machine translation, sentiment analysis, named entity recognition, question and answer, and others. Word embeddings are continuous word representations, they are an essential module for those applications and are generally used as input word representation to the deep learning models. Word2Vec and GloVe are two popular methods to learn word embeddings. They achieve good word representations, however, they learn representations with limited information because they ignore the morphological information of the words and consider only one representation vector for each word. This approach implies that Word2Vec and GloVe are unaware of the word inner structure. To mitigate this problem, the FastText model represents each word as a bag of characters n-grams. Hence, each n-gram has a continuous vector representation, and the final word representation is the sum of its characters n-grams vectors. Nevertheless, the use of all n-grams character of a word is a poor approach since some n-grams have no semantic relation with their words and increase the amount of potentially useless information. This approach also increases the training phase time. In this work, we propose a new method for training word embeddings, and its goal is to replace the FastText bag of character n-grams for a bag of word morphemes through the morphological analysis of the word. Thus, words with similar context and morphemes are represented by vectors close to each other. To evaluate our new approach, we performed intrinsic evaluations considering 15 different tasks, and the results show a competitive performance compared to FastText.

* 11 pages 

  Access Paper or Ask Questions

Deep Learning based Topic Analysis on Financial Emerging Event Tweets

Aug 03, 2020
Shaan Aryaman, Nguwi Yok Yen

Financial analyses of stock markets rely heavily on quantitative approaches in an attempt to predict subsequent or market movements based on historical prices and other measurable metrics. These quantitative analyses might have missed out on un-quantifiable aspects like sentiment and speculation that also impact the market. Analyzing vast amounts of qualitative text data to understand public opinion on social media platform is one approach to address this gap. This work carried out topic analysis on 28264 financial tweets [1] via clustering to discover emerging events in the stock market. Three main topics were discovered to be discussed frequently within the period. First, the financial ratio EPS is a measure that has been discussed frequently by investors. Secondly, short selling of shares were discussed heavily, it was often mentioned together with Morgan Stanley. Thirdly, oil and energy sectors were often discussed together with policy. These tweets were semantically clustered by a method consisting of word2vec algorithm to obtain word embeddings that map words to vectors. Semantic word clusters were then formed. Each tweet was then vectorized using the Term Frequency-Inverse Document Frequency (TF-IDF) values of the words it consisted of and based on which clusters its words were in. Tweet vectors were then converted to compressed representations by training a deep-autoencoder. K-means clusters were then formed. This method reduces dimensionality and produces dense vectors, in contrast to the usual Vector Space Model. Topic modelling with Latent Dirichlet Allocation (LDA) and top frequent words were used to analyze clusters and reveal emerging events.

  Access Paper or Ask Questions

Automated Problem Identification: Regression vs Classification via Evolutionary Deep Networks

Jul 03, 2017
Emmanuel Dufourq, Bruce A. Bassett

Regression or classification? This is perhaps the most basic question faced when tackling a new supervised learning problem. We present an Evolutionary Deep Learning (EDL) algorithm that automatically solves this by identifying the question type with high accuracy, along with a proposed deep architecture. Typically, a significant amount of human insight and preparation is required prior to executing machine learning algorithms. For example, when creating deep neural networks, the number of parameters must be selected in advance and furthermore, a lot of these choices are made based upon pre-existing knowledge of the data such as the use of a categorical cross entropy loss function. Humans are able to study a dataset and decide whether it represents a classification or a regression problem, and consequently make decisions which will be applied to the execution of the neural network. We propose the Automated Problem Identification (API) algorithm, which uses an evolutionary algorithm interface to TensorFlow to manipulate a deep neural network to decide if a dataset represents a classification or a regression problem. We test API on 16 different classification, regression and sentiment analysis datasets with up to 10,000 features and up to 17,000 unique target values. API achieves an average accuracy of $96.3\%$ in identifying the problem type without hardcoding any insights about the general characteristics of regression or classification problems. For example, API successfully identifies classification problems even with 1000 target values. Furthermore, the algorithm recommends which loss function to use and also recommends a neural network architecture. Our work is therefore a step towards fully automated machine learning.

* 9 pages, 6 figures, 4 tables 

  Access Paper or Ask Questions

One Model, Multiple Tasks: Pathways for Natural Language Understanding

Mar 07, 2022
Duyu Tang, Fan Zhang, Yong Dai, Cong Zhou, Shuangzhi Wu, Shuming Shi

This paper presents a Pathways approach to handle many tasks at once. Our approach is general-purpose and sparse. Unlike prevailing single-purpose models that overspecialize at individual tasks and learn from scratch when being extended to new tasks, our approach is general-purpose with the ability of stitching together existing skills to learn new tasks more effectively. Different from traditional dense models that always activate all the model parameters, our approach is sparsely activated: only relevant parts of the model (like pathways through the network) are activated. We take natural language understanding as a case study and define a set of skills like \textit{the skill of understanding the sentiment of text} and \textit{the skill of understanding natural language questions}. These skills can be reused and combined to support many different tasks and situations. We develop our system using Transformer as the backbone. For each skill, we implement skill-specific feed-forward networks, which are activated only if the skill is relevant to the task. An appealing feature of our model is that it not only supports sparsely activated fine-tuning, but also allows us to pretrain skills in the same sparse way with masked language modeling and next sentence prediction. We call this model \textbf{SkillNet}. We have three major findings. First, with only one model checkpoint, SkillNet performs better than task-specific fine-tuning and two multi-task learning baselines (i.e., dense model and Mixture-of-Experts model) on six tasks. Second, sparsely activated pre-training further improves the overall performance. Third, SkillNet significantly outperforms baseline systems when being extended to new tasks.

  Access Paper or Ask Questions

Explaining the Deep Natural Language Processing by Mining Textual Interpretable Features

Jun 12, 2021
Francesco Ventura, Salvatore Greco, Daniele Apiletti, Tania Cerquitelli

Despite the high accuracy offered by state-of-the-art deep natural-language models (e.g. LSTM, BERT), their application in real-life settings is still widely limited, as they behave like a black-box to the end-user. Hence, explainability is rapidly becoming a fundamental requirement of future-generation data-driven systems based on deep-learning approaches. Several attempts to fulfill the existing gap between accuracy and interpretability have been done. However, robust and specialized xAI (Explainable Artificial Intelligence) solutions tailored to deep natural-language models are still missing. We propose a new framework, named T-EBAnO, which provides innovative prediction-local and class-based model-global explanation strategies tailored to black-box deep natural-language models. Given a deep NLP model and the textual input data, T-EBAnO provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process. Specifically, the framework extracts sets of interpretable features mining the inner knowledge of the model. Then, it quantifies the influence of each feature during the prediction process by exploiting the novel normalized Perturbation Influence Relation index at the local level and the novel Global Absolute Influence and Global Relative Influence indexes at the global level. The effectiveness and the quality of the local and global explanations obtained with T-EBAnO are proved on (i) a sentiment analysis task performed by a fine-tuned BERT model, and (ii) a toxic comment classification task performed by an LSTM model.

  Access Paper or Ask Questions