Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Sentiment": models, code, and papers

An Effort to Measure Customer Relationship Performance in Indonesia's Fintech Industry

Feb 16, 2021
Alisya Putri Rabbani, Andry Alamsyah, Sri Widiyanesti

The availability of social media simplifies the companies-customers relationship. An effort to engage customers in conversation networks using social media is called Social Customer Relationship Management (SCRM). Social Network Analysis helps to understand network characteristics and how active the conversation network on social media. Calculating its network properties is beneficial for measuring customer relationship performance. Financial Technology, a new emerging industry that provides digital-based financial services utilize social media to interact with its customers. Measuring SCRM performance is needed in order to stay competitive among others. Therefore, we aim to explore the SCRM performance of the Indonesia Fintech company. In terms of discovering the market majority thought in conversation networks, we perform sentiment analysis by classifying into positive and negative opinion. As case studies, we investigate Twitter conversations about GoPay, OVO, Dana, and LinkAja during the observation period from 1st October until 1st November 2019. The result of this research is beneficial for business intelligence purposes especially in managing relationships with customers.

* The 11th SCBTII 2020 : Sustainable Collaboration in Business, Technology, Information and Innovation presents Virtual International Conference 
* 5 pages, 2 figures, 5 tables 

  Access Paper or Ask Questions

General Domain Adaptation Through Proportional Progressive Pseudo Labeling

Dec 23, 2020
Mohammad J. Hashemi, Eric Keller

Domain adaptation helps transfer the knowledge gained from a labeled source domain to an unlabeled target domain. During the past few years, different domain adaptation techniques have been published. One common flaw of these approaches is that while they might work well on one input type, such as images, their performance drops when applied to others, such as text or time-series. In this paper, we introduce Proportional Progressive Pseudo Labeling (PPPL), a simple, yet effective technique that can be implemented in a few lines of code to build a more general domain adaptation technique that can be applied on several different input types. At the beginning of the training phase, PPPL progressively reduces target domain classification error, by training the model directly with pseudo-labeled target domain samples, while excluding samples with more likely wrong pseudo-labels from the training set and also postponing training on such samples. Experiments on 6 different datasets that include tasks such as anomaly detection, text sentiment analysis and image classification demonstrate that PPPL can beat other baselines and generalize better.

* Published at 2020 IEEE International Conference on Big Data (Big Data) 

  Access Paper or Ask Questions

"Thy algorithm shalt not bear false witness": An Evaluation of Multiclass Debiasing Methods on Word Embeddings

Nov 04, 2020
Thalea Schlender, Gerasimos Spanakis

With the vast development and employment of artificial intelligence applications, research into the fairness of these algorithms has been increased. Specifically, in the natural language processing domain, it has been shown that social biases persist in word embeddings and are thus in danger of amplifying these biases when used. As an example of social bias, religious biases are shown to persist in word embeddings and the need for its removal is highlighted. This paper investigates the state-of-the-art multiclass debiasing techniques: Hard debiasing, SoftWEAT debiasing and Conceptor debiasing. It evaluates their performance when removing religious bias on a common basis by quantifying bias removal via the Word Embedding Association Test (WEAT), Mean Average Cosine Similarity (MAC) and the Relative Negative Sentiment Bias (RNSB). By investigating the religious bias removal on three widely used word embeddings, namely: Word2Vec, GloVe, and ConceptNet, it is shown that the preferred method is ConceptorDebiasing. Specifically, this technique manages to decrease the measured religious bias on average by 82,42%, 96,78% and 54,76% for the three word embedding sets respectively.

* 15 pages, presented at BNAIC/BENELEARN 2020, data/code at 

  Access Paper or Ask Questions

MTGAT: Multimodal Temporal Graph Attention Networks for Unaligned Human Multimodal Language Sequences

Oct 22, 2020
Jianing Yang, Yongxin Wang, Ruitao Yi, Yuying Zhu, Azaan Rehman, Amir Zadeh, Soujanya Poria, Louis-Philippe Morency

Human communication is multimodal in nature; it is through multiple modalities, i.e., language, voice, and facial expressions, that opinions and emotions are expressed. Data in this domain exhibits complex multi-relational and temporal interactions. Learning from this data is a fundamentally challenging research problem. In this paper, we propose Multimodal Temporal Graph Attention Networks (MTGAT). MTGAT is an interpretable graph-based neural model that provides a suitable framework for analyzing this type of multimodal sequential data. We first introduce a procedure to convert unaligned multimodal sequence data into a graph with heterogeneous nodes and edges that captures the rich interactions between different modalities through time. Then, a novel graph operation, called Multimodal Temporal Graph Attention, along with a dynamic pruning and read-out technique is designed to efficiently process this multimodal temporal graph. By learning to focus only on the important interactions within the graph, our MTGAT is able to achieve state-of-the-art performance on multimodal sentiment analysis and emotion recognition benchmarks including IEMOCAP and CMU-MOSI, while utilizing significantly fewer computations.

  Access Paper or Ask Questions

Many-to-one Recurrent Neural Network for Session-based Recommendation

Aug 25, 2020
Amine Dadoun, Raphael Troncy

This paper presents the D2KLab team's approach to the RecSys Challenge 2019 which focuses on the task of recommending accommodations based on user sessions. What is the feeling of a person who says "Rooms of the hotel are enormous, staff are friendly and efficient"? It is positive. Similarly to the sequence of words in a sentence where one can affirm what the feeling is, analysing a sequence of actions performed by a user in a website can lead to predict what will be the item the user will add to his basket at the end of the shopping session. We propose to use a many-to-one recurrent neural network that learns the probability that a user will click on an accommodation based on the sequence of actions he has performed during his browsing session. More specifically, we combine a rule-based algorithm with a Gated Recurrent Unit RNN in order to sort the list of accommodations that is shown to the user. We optimized the RNN on a validation set, tuning the hyper-parameters such as the learning rate, the batch-size and the accommodation embedding size. This analogy with the sentiment analysis task gives promising results. However, it is computationally demanding in the training phase and it needs to be further tuned.

  Access Paper or Ask Questions

Quantification of BERT Diagnosis Generalizability Across Medical Specialties Using Semantic Dataset Distance

Aug 20, 2020
Mihir P. Khambete, William Su, Juan Garcia, Joseph Lehar, Martin Kang, Marcus A. Badgeley

Deep learning models in healthcare may fail to generalize on data from unseen corpora. Additionally, no quantitative metric exists to tell how existing models will perform on new data. Previous studies demonstrated that NLP models of medical notes generalize variably between institutions, but ignored other levels of healthcare organization. We measured SciBERT diagnosis sentiment classifier generalizability between medical specialties using EHR sentences from MIMIC-III. Models trained on one specialty performed better on internal test sets than mixed or external test sets (mean AUCs 0.92, 0.87, and 0.83, respectively; p = 0.016). When models are trained on more specialties, they have better test performances (p < 1e-4). Model performance on new corpora is directly correlated to the similarity between train and test sentence content (p < 1e-4). Future studies should assess additional axes of generalization to ensure deep learning models fulfil their intended purpose across institutions, specialties, and practices.

* 20 pages, 10 figures 

  Access Paper or Ask Questions

FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply

Jun 12, 2020
Lingjiao Chen, Matei Zaharia, James Zou

Prediction APIs offered for a fee are a fast-growing industry and an important part of machine learning as a service. While many such services are available, the heterogeneity in their price and performance makes it challenging for users to decide which API or combination of APIs to use for their own data and budget. We take a first step towards addressing this challenge by proposing FrugalML, a principled framework that jointly learns the strength and weakness of each API on different data, and performs an efficient optimization to automatically identify the best sequential strategy to adaptively use the available APIs within a budget constraint. Our theoretical analysis shows that natural sparsity in the formulation can be leveraged to make FrugalML efficient. We conduct systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Baidu and other providers for tasks including facial emotion recognition, sentiment analysis and speech recognition. Across various tasks, FrugalML can achieve up to 90% cost reduction while matching the accuracy of the best single API, or up to 5% better accuracy while matching the best API's cost.

  Access Paper or Ask Questions

FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms

Jun 28, 2019
Henry B. Moss, Andrew Moore, David S. Leslie, Paul Rayson

We present FIESTA, a model selection approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models. Despite being known to produce unreliable comparisons, it is still common practice to compare model evaluations based on single choices of random seeds. We show that reliable model selection also requires evaluations based on multiple train-test splits (contrary to common practice in many shared tasks). Using bandit theory from the statistics literature, we are able to adaptively determine appropriate numbers of data splits and random seeds used to evaluate each model, focusing computational resources on the evaluation of promising models whilst avoiding wasting evaluations on models with lower performance. Furthermore, our user-friendly Python implementation produces confidence guarantees of correctly selecting the optimal model. We evaluate our algorithms by selecting between 8 target-dependent sentiment analysis methods using dramatically fewer model evaluations than current model selection approaches.

* ACL 2019. Code available at: 

  Access Paper or Ask Questions