Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning

Jul 04, 2017
Werner Zellinger, Thomas Grubinger, Edwin Lughofer, Thomas Natschläger, Susanne Saminger-Platz

The learning of domain-invariant representations in the context of domain adaptation with neural networks is considered. We propose a new regularization method that minimizes the discrepancy between domain-specific latent feature representations directly in the hidden activation space. Although some standard distribution matching approaches exist that can be interpreted as the matching of weighted sums of moments, e.g. Maximum Mean Discrepancy (MMD), an explicit order-wise matching of higher order moments has not been considered before. We propose to match the higher order central moments of probability distributions by means of order-wise moment differences. Our model does not require computationally expensive distance and kernel matrix computations. We utilize the equivalent representation of probability distributions by moment sequences to define a new distance function, called Central Moment Discrepancy (CMD). We prove that CMD is a metric on the set of probability distributions on a compact interval. We further prove that convergence of probability distributions on compact intervals w.r.t. the new metric implies convergence in distribution of the respective random variables. We test our approach on two different benchmark data sets for object recognition (Office) and sentiment analysis of product reviews (Amazon reviews). CMD achieves a new state-of-the-art performance on most domain adaptation tasks of Office and outperforms networks trained with MMD, Variational Fair Autoencoders and Domain Adversarial Neural Networks on Amazon reviews. In addition, a post-hoc parameter sensitivity analysis shows that the new approach is stable w.r.t. parameter changes in a certain interval. The source code of the experiments is publicly available.

* Published in ICLR 2017 (conference track) 

  Access Paper or Ask Questions

HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition

Feb 03, 2021
Avihay Chriqui, Inbal Yahav

The use of Bidirectional Encoder Representations from Transformers (BERT) models for different natural language processing (NLP) tasks, and for sentiment analysis in particular, has become very popular in recent years and not in vain. The use of social media is being constantly on the rise. Its impact on all areas of our lives is almost inconceivable. Researches show that social media nowadays serves as one of the main tools where people freely express their ideas, opinions, and emotions. During the current Covid-19 pandemic, the role of social media as a tool to resonate opinions and emotions, became even more prominent. This paper introduces HeBERT and HebEMO. HeBERT is a transformer-based model for modern Hebrew text. Hebrew is considered a Morphological Rich Language (MRL), with unique characteristics that pose a great challenge in developing appropriate Hebrew NLP models. Analyzing multiple specifications of the BERT architecture, we come up with a language model that outperforms all existing Hebrew alternatives on multiple language tasks. HebEMO is a tool that uses HeBERT to detect polarity and extract emotions from Hebrew user-generated content (UGC), which was trained on a unique Covid-19 related dataset that we collected and annotated for this study. Data collection and annotation followed an innovative iterative semi-supervised process that aimed to maximize predictability. HebEMO yielded a high performance of weighted average F1-score = 0.96 for polarity classification. Emotion detection reached an F1-score of 0.78-0.97, with the exception of \textit{surprise}, which the model failed to capture (F1 = 0.41). These results are better than the best-reported performance, even when compared to the English language.

  Access Paper or Ask Questions

A Deep Evolutionary Approach to Bioinspired Classifier Optimisation for Brain-Machine Interaction

Aug 13, 2019
Jordan J. Bird, Diego R. Faria, Luis J. Manso, Anikó Ekárt, Christopher D. Buckingham

This study suggests a new approach to EEG data classification by exploring the idea of using evolutionary computation to both select useful discriminative EEG features and optimise the topology of Artificial Neural Networks. An evolutionary algorithm is applied to select the most informative features from an initial set of 2550 EEG statistical features. Optimisation of a Multilayer Perceptron (MLP) is performed with an evolutionary approach before classification to estimate the best hyperparameters of the network. Deep learning and tuning with Long Short-Term Memory (LSTM) are also explored, and Adaptive Boosting of the two types of models is tested for each problem. Three experiments are provided for comparison using different classifiers: one for attention state classification, one for emotional sentiment classification, and a third experiment in which the goal is to guess the number a subject is thinking of. The obtained results show that an Adaptive Boosted LSTM can achieve an accuracy of 84.44%, 97.06%, and 9.94% on the attentional, emotional, and number datasets, respectively. An evolutionary-optimised MLP achieves results close to the Adaptive Boosted LSTM for the two first experiments and significantly higher for the number-guessing experiment with an Adaptive Boosted DEvo MLP reaching 31.35%, while being significantly quicker to train and classify. In particular, the accuracy of the nonboosted DEvo MLP was of 79.81%, 96.11%, and 27.07% in the same benchmarks. Two datasets for the experiments were gathered using a Muse EEG headband with four electrodes corresponding to TP9, AF7, AF8, and TP10 locations of the international EEG placement standard. The EEG MindBigData digits dataset was gathered from the TP9, FP1, FP2, and TP10 locations.

* 14 pages, 12 figures 

  Access Paper or Ask Questions

Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training

Oct 10, 2018
Bei Liu, Jianlong Fu, Makoto P. Kato, Masatoshi Yoshikawa

Automatic generation of natural language from images has attracted extensive attention. In this paper, we take one step further to investigate generation of poetic language (with multiple lines) to an image for automatic poetry creation. This task involves multiple challenges, including discovering poetic clues from the image (e.g., hope from green), and generating poems to satisfy both relevance to the image and poeticness in language level. To solve the above challenges, we formulate the task of poem generation into two correlated sub-tasks by multi-adversarial training via policy gradient, through which the cross-modal relevance and poetic language style can be ensured. To extract poetic clues from images, we propose to learn a deep coupled visual-poetic embedding, in which the poetic representation from objects, sentiments and scenes in an image can be jointly learned. Two discriminative networks are further introduced to guide the poem generation, including a multi-modal discriminator and a poem-style discriminator. To facilitate the research, we have released two poem datasets by human annotators with two distinct properties: 1) the first human annotated image-to-poem pair dataset (with 8,292 pairs in total), and 2) to-date the largest public English poem corpus dataset (with 92,265 different poems in total). Extensive experiments are conducted with 8K images, among which 1.5K image are randomly picked for evaluation. Both objective and subjective evaluations show the superior performances against the state-of-the-art methods for poem generation from images. Turing test carried out with over 500 human subjects, among which 30 evaluators are poetry experts, demonstrates the effectiveness of our approach.

  Access Paper or Ask Questions

Polarization Measurement of High Dimensional Social Media Messages With Support Vector Machine Algorithm Using Mapreduce

Mar 11, 2015
Ferhat Özgür Çatak

In this article, we propose a new Support Vector Machine (SVM) training algorithm based on distributed MapReduce technique. In literature, there are a lots of research that shows us SVM has highest generalization property among classification algorithms used in machine learning area. Also, SVM classifier model is not affected by correlations of the features. But SVM uses quadratic optimization techniques in its training phase. The SVM algorithm is formulated as quadratic optimization problem. Quadratic optimization problem has $O(m^3)$ time and $O(m^2)$ space complexity, where m is the training set size. The computation time of SVM training is quadratic in the number of training instances. In this reason, SVM is not a suitable classification algorithm for large scale dataset classification. To solve this training problem we developed a new distributed MapReduce method developed. Accordingly, (i) SVM algorithm is trained in distributed dataset individually; (ii) then merge all support vectors of classifier model in every trained node; and (iii) iterate these two steps until the classifier model converges to the optimal classifier function. In the implementation phase, large scale social media dataset is presented in TFxIDF matrix. The matrix is used for sentiment analysis to get polarization value. Two and three class models are created for classification method. Confusion matrices of each classification model are presented in tables. Social media messages corpus consists of 108 public and 66 private universities messages in Turkey. Twitter is used for source of corpus. Twitter user messages are collected using Twitter Streaming API. Results are shown in graphics and tables.

* 12 pages, in Turkish 

  Access Paper or Ask Questions

From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data

Sep 29, 2020
Weiran Yao, Sean Qian

The effectiveness of traditional traffic prediction methods is often extremely limited when forecasting traffic dynamics in early morning. The reason is that traffic can break down drastically during the early morning commute, and the time and duration of this break-down vary substantially from day to day. Early morning traffic forecast is crucial to inform morning-commute traffic management, but they are generally challenging to predict in advance, particularly by midnight. In this paper, we propose to mine Twitter messages as a probing method to understand the impacts of people's work and rest patterns in the evening/midnight of the previous day to the next-day morning traffic. The model is tested on freeway networks in Pittsburgh as experiments. The resulting relationship is surprisingly simple and powerful. We find that, in general, the earlier people rest as indicated from Tweets, the more congested roads will be in the next morning. The occurrence of big events in the evening before, represented by higher or lower tweet sentiment than normal, often implies lower travel demand in the next morning than normal days. Besides, people's tweeting activities in the night before and early morning are statistically associated with congestion in morning peak hours. We make use of such relationships to build a predictive framework which forecasts morning commute congestion using people's tweeting profiles extracted by 5 am or as late as the midnight prior to the morning. The Pittsburgh study supports that our framework can precisely predict morning congestion, particularly for some road segments upstream of roadway bottlenecks with large day-to-day congestion variation. Our approach considerably outperforms those existing methods without Twitter message features, and it can learn meaningful representation of demand from tweeting profiles that offer managerial insights.

  Access Paper or Ask Questions

Emotion Correlation Mining Through Deep Learning Models on Natural Language Text

Jul 28, 2020
Xinzhi Wang, Luyao Kou, Vijayan Sugumaran, Xiangfeng Luo, Hui Zhang

Emotion analysis has been attracting researchers' attention. Most previous works in the artificial intelligence field focus on recognizing emotion rather than mining the reason why emotions are not or wrongly recognized. Correlation among emotions contributes to the failure of emotion recognition. In this paper, we try to fill the gap between emotion recognition and emotion correlation mining through natural language text from web news. Correlation among emotions, expressed as the confusion and evolution of emotion, is primarily caused by human emotion cognitive bias. To mine emotion correlation from emotion recognition through text, three kinds of features and two deep neural network models are presented. The emotion confusion law is extracted through orthogonal basis. The emotion evolution law is evaluated from three perspectives, one-step shift, limited-step shifts, and shortest path transfer. The method is validated using three datasets-the titles, the bodies, and the comments of news articles, covering both objective and subjective texts in varying lengths (long and short). The experimental results show that, in subjective comments, emotions are easily mistaken as anger. Comments tend to arouse emotion circulations of love-anger and sadness-anger. In objective news, it is easy to recognize text emotion as love and cause fear-joy circulation. That means, journalists may try to attract attention using fear and joy words but arouse the emotion love instead; After news release, netizens generate emotional comments to express their intense emotions, i.e., anger, sadness, and love. These findings could provide insights for applications regarding affective interaction such as network public sentiment, social media communication, and human-computer interaction.

  Access Paper or Ask Questions

Beyond Social Media Analytics: Understanding Human Behaviour and Deep Emotion using Self Structuring Incremental Machine Learning

Sep 05, 2020
Tharindu Bandaragoda

This thesis develops a conceptual framework considering social data as representing the surface layer of a hierarchy of human social behaviours, needs and cognition which is employed to transform social data into representations that preserve social behaviours and their causalities. Based on this framework two platforms were built to capture insights from fast-paced and slow-paced social data. For fast-paced, a self-structuring and incremental learning technique was developed to automatically capture salient topics and corresponding dynamics over time. An event detection technique was developed to automatically monitor those identified topic pathways for significant fluctuations in social behaviours using multiple indicators such as volume and sentiment. This platform is demonstrated using two large datasets with over 1 million tweets. The separated topic pathways were representative of the key topics of each entity and coherent against topic coherence measures. Identified events were validated against contemporary events reported in news. Secondly for the slow-paced social data, a suite of new machine learning and natural language processing techniques were developed to automatically capture self-disclosed information of the individuals such as demographics, emotions and timeline of personal events. This platform was trialled on a large text corpus of over 4 million posts collected from online support groups. This was further extended to transform prostate cancer related online support group discussions into a multidimensional representation and investigated the self-disclosed quality of life of patients (and partners) against time, demographics and clinical factors. The capabilities of this extended platform have been demonstrated using a text corpus collected from 10 prostate cancer online support groups comprising of 609,960 prostate cancer discussions and 22,233 patients.

  Access Paper or Ask Questions

Answer Generation for Questions With Multiple Information Sources in E-Commerce

Nov 27, 2021
Anand A. Rajasekar, Nikesh Garera

Automatic question answering is an important yet challenging task in E-commerce given the millions of questions posted by users about the product that they are interested in purchasing. Hence, there is a great demand for automatic answer generation systems that provide quick responses using related information about the product. There are three sources of knowledge available for answering a user posted query, they are reviews, duplicate or similar questions, and specifications. Effectively utilizing these information sources will greatly aid us in answering complex questions. However, there are two main challenges present in exploiting these sources: (i) The presence of irrelevant information and (ii) the presence of ambiguity of sentiment present in reviews and similar questions. Through this work we propose a novel pipeline (MSQAP) that utilizes the rich information present in the aforementioned sources by separately performing relevancy and ambiguity prediction before generating a response. Experimental results show that our relevancy prediction model (BERT-QA) outperforms all other variants and has an improvement of 12.36% in F1 score compared to the BERT-base baseline. Our generation model (T5-QA) outperforms the baselines in all content preservation metrics such as BLEU, ROUGE and has an average improvement of 35.02% in ROUGE and 198.75% in BLEU compared to the highest performing baseline (HSSC-q). Human evaluation of our pipeline shows us that our method has an overall improvement in accuracy of 30.7% over the generation model (T5-QA), resulting in our full pipeline-based approach (MSQAP) providing more accurate answers. To the best of our knowledge, this is the first work in the e-commerce domain that automatically generates natural language answers combining the information present in diverse sources such as specifications, similar questions, and reviews data.

* 7 pages, 9 tables, 1 figure 

  Access Paper or Ask Questions

Learning Using 1-Local Membership Queries

Dec 01, 2015
Galit Bary

Classic machine learning algorithms learn from labelled examples. For example, to design a machine translation system, a typical training set will consist of English sentences and their translation. There is a stronger model, in which the algorithm can also query for labels of new examples it creates. E.g, in the translation task, the algorithm can create a new English sentence, and request its translation from the user during training. This combination of examples and queries has been widely studied. Yet, despite many theoretical results, query algorithms are almost never used. One of the main causes for this is a report (Baum and Lang, 1992) on very disappointing empirical performance of a query algorithm. These poor results were mainly attributed to the fact that the algorithm queried for labels of examples that are artificial, and impossible to interpret by humans. In this work we study a new model of local membership queries (Awasthi et al., 2012), which tries to resolve the problem of artificial queries. In this model, the algorithm is only allowed to query the labels of examples which are close to examples from the training set. E.g., in translation, the algorithm can change individual words in a sentence it has already seen, and then ask for the translation. In this model, the examples queried by the algorithm will be close to natural examples and hence, hopefully, will not appear as artificial or random. We focus on 1-local queries (i.e., queries of distance 1 from an example in the training sample). We show that 1-local membership queries are already stronger than the standard learning model. We also present an experiment on a well known NLP task of sentiment analysis. In this experiment, the users were asked to provide more information than merely indicating the label. We present results that illustrate that this extra information is beneficial in practice.

  Access Paper or Ask Questions