
"Recommendation": models, code, and papers

Causal Decision Making and Causal Effect Estimation Are Not the Same... and Why It Matters

Apr 08, 2021
Carlos Fernández-Loría, Foster Provost

Causal decision making (CDM) at scale has become a routine part of business, and increasingly CDM is based on machine learning algorithms. For example, businesses often target offers, incentives, and recommendations with the goal of affecting consumer behavior. Recently, we have seen an acceleration of research related to CDM and to causal effect estimation (CEE) using machine-learned models. This article highlights an important perspective: CDM is not the same as CEE, and counterintuitively, accurate CEE is not necessary for accurate CDM. Our experience is that this is not well understood by practitioners or by most researchers. Technically, the estimand of interest is different, and this has important implications both for modeling and for the use of statistical models for CDM. We draw on recent research to highlight three of these implications. (1) We should carefully consider the objective function of causal machine learning, and if possible, we should optimize for accurate "treatment assignment" rather than for accurate effect-size estimation. (2) Confounding does not have the same effect on CDM as it does on CEE. The upshot here is that for supporting CDM it may be just as good to learn with confounded data as with unconfounded data. Finally, (3) causal statistical modeling may not be necessary at all to support CDM, because there may be (and perhaps often is) a proxy target for statistical modeling that can do as well or better. This observation helps to explain at least one broad, common CDM practice that seems "wrong" at first blush: the widespread use of non-causal models for targeting interventions. Our perspective is that these observations open up substantial fertile ground for future research. Whether or not you share our perspective completely, we hope to facilitate future research in this area by pointing to related articles from multiple contributing fields.
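
A toy simulation (illustrative numbers only, not from the paper) makes implication (1) concrete: an estimator with far worse effect-size error can still support better treatment decisions, because the decision depends only on the sign of the estimated effect, not its magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
effect = rng.normal(0.0, 1.0, n)               # true individual-level treatment effects

est_a = effect + rng.normal(0.0, 2.0, n)        # A: unbiased but noisy effect estimates
est_b = 3.0 * effect + rng.normal(0.0, 2.0, n)  # B: magnitudes badly biased (3x), same noise

for name, est in [("A (unbiased)", est_a), ("B (biased)", est_b)]:
    mse = np.mean((est - effect) ** 2)            # CEE quality: effect-size error
    acc = np.mean((est > 0) == (effect > 0))      # CDM quality: treat iff estimated effect > 0
    print(f"{name}: effect-size MSE = {mse:.2f}, correct treatment decisions = {acc:.1%}")
```

Estimator B has roughly double the estimation error of A, yet makes more correct treatment decisions, because scaling the effect estimates never flips their sign.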

  

Synthesizing Credit Card Transactions

Oct 04, 2019
Erik R. Altman

Two elements have been essential to AI's recent boom: (1) deep neural nets and the theory and practice behind them; and (2) cloud computing with its abundant labeled data and large computing resources. Abundant labeled data is available for key domains such as images, speech, natural language processing, and recommendation engines. However, there are many other domains where such data is not available, or access to it is highly restricted for privacy reasons, as with health and financial data. Even when abundant data is available, it is often not labeled. Doing such labeling is labor-intensive and non-scalable. As a result, to the best of our knowledge, key domains still lack labeled data or have at most toy data; alternatively, synthetic data generators must have access to real data from which to mimic new data. This paper outlines work to generate realistic synthetic data for an important domain: credit card transactions. Some challenges: there are many patterns and correlations in real purchases. There are millions of merchants and innumerable locations. Those merchants offer a wide variety of goods. Who shops where and when? How much do people pay? What is a realistic fraudulent transaction? We use a mixture of technical approaches and domain knowledge, including the mechanics of credit card processing and a broad set of consumer domains: electronics, clothing, hair styling, etc. Connecting everything is a virtual world. This paper outlines some of our key techniques and provides evidence that the data generated is indeed realistic. Beyond the scope of this paper: (1) use of our data to develop and train models to predict fraud; (2) coupling models and the synthetic dataset to assess performance in designing accelerators such as GPUs and TPUs.
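
As a rough illustration of what generating synthetic transactions involves at the simplest level, here is a toy generator with hypothetical categories and distributions; the paper's virtual-world approach is far richer, modeling who shops where, when, and for how much.

```python
import random

CATEGORIES = {  # hypothetical category -> (mean log-amount, sd)
    "electronics": (5.0, 1.0),
    "clothing": (3.8, 0.8),
    "hair_styling": (3.4, 0.5),
}

def synth_transaction(rng: random.Random) -> dict:
    cat = rng.choice(list(CATEGORIES))
    mu, sigma = CATEGORIES[cat]
    return {
        "category": cat,
        "amount": round(rng.lognormvariate(mu, sigma), 2),   # long-tailed purchase amounts
        "hour": rng.choices(range(24), weights=[1]*7 + [4]*13 + [2]*4)[0],  # busier daytime
        "is_fraud": rng.random() < 0.001,                    # rare-event label
    }

rng = random.Random(42)
print([synth_transaction(rng) for _ in range(3)])
```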

  

Qualitative and Quantitative Risk Analysis and Safety Assessment of Unmanned Aerial Vehicles Missions over the Internet

Apr 20, 2019
Azza Allouch, Anis Koubaa, Mohamed Khalgui, Tarek Abbes

In the last few years, Unmanned Aerial Vehicles (UAVs) have emerged as a revolutionary technology with many different applications in the military, civilian, and commercial fields. The advent of autonomous drones has introduced serious challenges, including how to maintain their safe operation during their missions. The safe operation of UAVs remains an open and sensitive issue, since any unexpected behavior of the drone or any hazard could lead to potential risks that might be very severe. The motivation behind this work is to propose a methodology for the safety assurance of drones over the Internet (the Internet of Drones (IoD)). Two approaches are used in performing the safety analysis: (1) a qualitative safety analysis approach, and (2) a quantitative safety analysis approach. The first approach uses the international safety standards ISO 12100 and ISO 13849 to assess the safety of drone missions by focusing on qualitative assessment techniques. The methodology proceeds through hazard identification, risk assessment, and risk mitigation, and finally draws the safety recommendations associated with a drone delivery use case. The second approach presents a method for quantitative safety assessment using Bayesian Networks (BN) for probabilistic modeling. The BN uses the information provided by the first approach to model the safety risks related to UAV flights. An illustrative UAV crash scenario is presented as a case study, followed by scenario analysis, to demonstrate the applicability of the proposed approach. Together, the qualitative and quantitative analyses enable all involved stakeholders to detect, explore, and address the risks of UAV flights, which will help the industry better manage the safety concerns of UAVs.

* IEEE Access, April 2019 
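
A minimal sketch of the kind of quantitative model the second approach builds, assuming the pgmpy library and hypothetical hazard nodes and probabilities (the paper derives its actual structure and numbers from the qualitative analysis):

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Two hazards feed a crash node; values 0/1 mean absent/present.
model = BayesianNetwork([("GPSLoss", "Crash"), ("LowBattery", "Crash")])
model.add_cpds(
    TabularCPD("GPSLoss", 2, [[0.95], [0.05]]),
    TabularCPD("LowBattery", 2, [[0.90], [0.10]]),
    TabularCPD(
        "Crash", 2,
        # P(Crash | GPSLoss, LowBattery); columns: (0,0), (0,1), (1,0), (1,1)
        [[0.99, 0.80, 0.70, 0.30],
         [0.01, 0.20, 0.30, 0.70]],
        evidence=["GPSLoss", "LowBattery"], evidence_card=[2, 2],
    ),
)
assert model.check_model()

# Scenario analysis: crash probability given an observed GPS loss.
print(VariableElimination(model).query(["Crash"], evidence={"GPSLoss": 1}))
```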
  

Link Prediction via Higher-Order Motif Features

Feb 08, 2019
Ghadeer Abuoda, Gianmarco De Francisci Morales, Ashraf Aboulnaga

Link prediction requires predicting which new links are likely to appear in a graph. Being able to predict unseen links with good accuracy has important applications in several domains such as social media, security, transportation, and recommendation systems. A common approach is to use features based on the common neighbors of an unconnected pair of nodes to predict whether the pair will form a link in the future. In this paper, we present an approach for link prediction that relies on higher-order analysis of the graph topology, well beyond common neighbors. We treat the link prediction problem as a supervised classification problem, and we propose a set of features that depend on the patterns or motifs that a pair of nodes occurs in. By using motifs of sizes 3, 4, and 5, our approach captures a high level of detail about the graph topology within the neighborhood of the pair of nodes, which leads to a higher classification accuracy. In addition to proposing the use of motif-based features, we also propose two optimizations related to constructing the classification dataset from the graph. First, to ensure that positive and negative examples are treated equally when extracting features, we propose adding the negative examples to the graph as an alternative to the common approach of removing the positive ones. Second, we show that it is important to control for the shortest-path distance when sampling pairs of nodes to form negative examples, since the difficulty of prediction varies with the shortest-path distance. We experimentally demonstrate that using off-the-shelf classifiers with a well-constructed classification dataset results in an increase of up to 10 percentage points in accuracy over prior topology-based and feature-learning methods.
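
A minimal sketch of the overall recipe, using networkx and scikit-learn with only a size-3 motif signal (common neighbors); the paper's feature set extends to size-4 and size-5 motifs and additionally controls negative sampling by shortest-path distance.

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

G = nx.karate_club_graph()
pos = list(G.edges())                       # positive examples: existing links
candidates = list(nx.non_edges(G))
rng = np.random.default_rng(0)
neg = [candidates[i] for i in rng.choice(len(candidates), size=len(pos), replace=False)]

# The paper's first optimization: add the negative examples to the graph (rather
# than removing the positives) so both classes are featurized under the same conditions.
G.add_edges_from(neg)

def pair_features(u, v):
    common = len(list(nx.common_neighbors(G, u, v)))  # size-3 motif signal
    return [common, G.degree(u), G.degree(v)]

X = np.array([pair_features(u, v) for u, v in pos + neg])
y = np.array([1] * len(pos) + [0] * len(neg))
clf = RandomForestClassifier(random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```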

  

Styling with Attention to Details

Jul 03, 2018
Ayushi Dalmia, Sachindra Joshi, Raghavendra Singh, Vikas Raykar

Fashion, by its very nature, is driven by style. In this paper, we propose a method that takes style information into account to complete a given set of selected fashion items with a complementary fashion item. Complementary items are those that can be worn along with the selected items according to the style. Addressing this problem facilitates the automatic generation of stylish fashion ensembles, leading to a richer shopping experience for users. Recently, there has been a surge of online social websites where fashion enthusiasts post the outfit of the day and other users can like and comment on them. These posts contain a gold mine of information about style. In this paper, we exploit these posts to train a deep neural network that captures style in an automated manner. We pose the prediction of complementary fashion items as a sequence-to-sequence problem, where the input is the selected set of fashion items and the output is a complementary fashion item based on the style information learned by the model. We use an encoder-decoder architecture to solve this problem of completing the set of fashion items. We evaluate the goodness of the proposed model through a variety of experiments. We empirically observe that our proposed model outperforms a competitive baseline, the Apriori algorithm, by ~28% in accuracy for top-1 recommendation to complete the fashion ensemble. We also perform retrieval-based experiments to understand the ability of the model to learn style and rank the complementary fashion items, and find that using attention in our encoder-decoder model improves the mean reciprocal rank by ~24%. Qualitatively, we find that the complementary fashion items generated by our proposed model are richer than those produced by the Apriori algorithm.
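
A minimal sketch of the sequence-to-sequence formulation in PyTorch, with hypothetical item IDs and dimensions; the paper's model additionally uses attention over the encoder states.

```python
import torch
import torch.nn as nn

class OutfitCompleter(nn.Module):
    def __init__(self, n_items=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_items, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_items)  # scores over all candidate items

    def forward(self, selected_items, target_item):
        _, h = self.encoder(self.embed(selected_items))   # summarize the selected outfit
        dec_out, _ = self.decoder(self.embed(target_item), h)
        return self.out(dec_out)                          # logits for the complementary item

model = OutfitCompleter()
outfit = torch.randint(0, 1000, (8, 5))   # batch of 8 outfits, 5 items each
target = torch.randint(0, 1000, (8, 1))   # complementary item to predict
print(model(outfit, target).shape)        # torch.Size([8, 1, 1000])
```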

  

Study of Feature Importance for Quantum Machine Learning Models

Feb 18, 2022
Aaron Baughman, Kavitha Yogaraj, Raja Hebbar, Sudeep Ghosh, Rukhsan Ul Haq, Yoshika Chhabra

Predictor importance is a crucial part of data preprocessing pipelines in both classical and quantum machine learning (QML). This work presents the first study of its kind in which feature importance for QML models is explored and contrasted against that of their classical machine learning (CML) equivalents. We developed a hybrid quantum-classical architecture in which QML models are trained and feature importance values are calculated from classical algorithms on a real-world dataset. This architecture has been implemented on ESPN Fantasy Football data using Qiskit statevector simulators and IBM quantum hardware such as the IBMQ Mumbai and IBMQ Montreal systems. Even though we are in the Noisy Intermediate-Scale Quantum (NISQ) era, the physical quantum computing results are promising. To accommodate the scale of current quantum hardware, we created data tiering, model aggregation, and novel validation methods. Notably, the feature importance magnitudes from the quantum models showed much higher variation when contrasted with the classical models. We show through diversity measurements that equivalent QML and CML models are complementary: the diversity between QML and CML demonstrates that both approaches can contribute to a solution in different ways. Within this paper we focus on Quantum Support Vector Classifiers (QSVC), Variational Quantum Circuits (VQC), and their classical counterparts. The ESPN and IBM Fantasy Football Trade Assistant combines advanced statistical analysis with the natural language processing of Watson Discovery to serve up personalized trade recommendations that are fair and to propose trades. Here, valuation data for each player has been considered, and this work can be extended to calculate the feature importance of other QML models, such as Quantum Boltzmann machines.

* 21 pages, 15 figures, 1 Table 
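
A minimal sketch of the model-agnostic route the paper describes: feature importance computed by a classical algorithm against a fitted model. Here scikit-learn's permutation importance is applied to a classical SVC on synthetic stand-in data; a fitted quantum classifier such as Qiskit's QSVC could be scored by the same procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.svm import SVC

# Synthetic stand-in for player valuation features.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3, random_state=0)
clf = SVC().fit(X, y)

# Permutation importance: shuffle each feature and measure the score drop.
result = permutation_importance(clf, X, y, n_repeats=20, random_state=0)
for i, (mean, std) in enumerate(zip(result.importances_mean, result.importances_std)):
    print(f"feature {i}: importance {mean:.3f} +/- {std:.3f}")
```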
  

It is rotating leaders who build the swarm: social network determinants of growth for healthcare virtual communities of practice

May 26, 2021
G. Antonacci, A. Fronzetti Colladon, A. Stefanini, P. Gloor

Purpose: The purpose of this paper is to identify the factors influencing the growth of healthcare virtual communities of practice (VCoPs) through a seven-year longitudinal study conducted using metrics from social-network and semantic analysis. By studying online communication along the three dimensions of social interactions (connectivity, interactivity and language use), the authors aim to provide VCoP managers with valuable insights to improve the success of their communities. Design/methodology/approach: Communications between 14,000 members of 16 different healthcare VCoPs coexisting on the same web platform were analysed over a period of seven years (April 2008 to April 2015). Multilevel regression models were used to reveal the main determinants of community growth over time. Independent variables were derived from social network and semantic analysis measures. Findings: Results show that structural and content-based variables predict the growth of the community. Progressively more people will join a community if its structure is more centralised, its leaders are more dynamic (they rotate more) and the language used in the posts is less complex. Research limitations/implications: The available data set covered a single web platform and a limited number of control variables. To consolidate the findings of the present study, the experiment should be replicated on other healthcare VCoPs. Originality/value: The study provides useful recommendations for setting up and nurturing the growth of professional communities, considering, at the same time, the interaction patterns among community members, the dynamic evolution of these interactions and the use of language. New analytical tools are presented, together with innovative interaction metrics that can significantly influence community growth, such as rotating leadership.

* Journal of Knowledge Management 21(5), 1218-1239 (2017) 
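
A minimal sketch of the multilevel-regression setup, using statsmodels on a synthetic stand-in panel with hypothetical column names; the paper's actual variables come from social network and semantic analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in panel: one row per community per period.
rng = np.random.default_rng(0)
n = 16 * 28  # 16 communities, 28 quarterly observations (seven years)
df = pd.DataFrame({
    "community": np.repeat(np.arange(16), 28),
    "centralization": rng.uniform(0, 1, n),
    "rotating_leadership": rng.uniform(0, 1, n),
    "language_complexity": rng.uniform(0, 1, n),
})
df["growth"] = (0.5 * df.centralization + 0.8 * df.rotating_leadership
                - 0.6 * df.language_complexity + rng.normal(0, 0.3, n))

# Mixed-effects model: random intercept per community, as in a multilevel regression.
model = smf.mixedlm("growth ~ centralization + rotating_leadership + language_complexity",
                    data=df, groups=df["community"])
print(model.fit().summary())
```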
  

Representation Learning from Limited Educational Data with Crowdsourced Labels

Sep 23, 2020
Wentao Wang, Guowei Xu, Wenbiao Ding, Gale Yan Huang, Guoliang Li, Jiliang Tang, Zitao Liu

Representation learning has been proven to play an important role in the unprecedented success of machine learning models in numerous tasks, such as machine translation, face recognition and recommendation. The majority of existing representation learning approaches often require a large number of consistent and noise-free labels. However, due to various reasons such as budget constraints and privacy concerns, labels are very limited in many real-world scenarios. Directly applying standard representation learning approaches on small labeled data sets will easily run into over-fitting problems and lead to sub-optimal solutions. Even worse, in some domains such as education, the limited labels are usually annotated by multiple workers with diverse expertise, which yields noise and inconsistency in such crowdsourcing settings. In this paper, we propose a novel framework which aims to learn effective representations from limited data with crowdsourced labels. Specifically, we design a grouping-based deep neural network to learn embeddings from a limited number of training samples and present a Bayesian confidence estimator to capture the inconsistency among crowdsourced labels. Furthermore, to expedite the training process, we develop a hard example selection procedure to adaptively pick training examples that are misclassified by the model. Extensive experiments conducted on three real-world data sets demonstrate the superiority of our framework on learning representations from limited data with crowdsourced labels, compared with various state-of-the-art baselines. In addition, we provide a comprehensive analysis of each of the main components of our proposed framework and also introduce the promising results it achieved in real production to fully understand the proposed framework.

* IEEE Transactions on Knowledge and Data Engineering (Accepted) 
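
A minimal sketch of the hard-example selection idea, in PyTorch with toy data: each epoch, the update preferentially revisits examples the current model misclassifies. This shows only the selection mechanic, not the paper's full framework.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 16)             # toy features
y = torch.randint(0, 3, (256,))      # toy 3-class labels
model = nn.Linear(16, 3)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    with torch.no_grad():            # find currently misclassified examples
        hard = (model(X).argmax(dim=1) != y).nonzero(as_tuple=True)[0]
    idx = hard if len(hard) > 0 else torch.arange(len(X))  # fall back to all data
    opt.zero_grad()
    loss = loss_fn(model(X[idx]), y[idx])   # train preferentially on hard examples
    loss.backward()
    opt.step()
    print(f"epoch {epoch}: {len(hard)} hard examples, loss {loss.item():.3f}")
```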
  

Multi-Armed Bandits with Correlated Arms

Dec 03, 2019
Samarth Gupta, Shreyas Chaudhari, Gauri Joshi, Osman Yağan

We consider a multi-armed bandit framework where the rewards obtained by pulling different arms are correlated. The correlation information is captured in terms of pseudo-rewards, which are bounds on the reward of one arm given a reward realization from another arm, and which can capture many general correlation structures. We leverage these pseudo-rewards to design a novel approach that extends any classical bandit algorithm to the correlated multi-armed bandit setting studied in this framework. In each round, our proposed C-Bandit algorithm identifies some arms as empirically non-competitive and avoids exploring them for that round. Through a unified regret analysis of the proposed C-Bandit algorithm, we show that C-UCB and C-TS (the correlated-bandit versions of Upper Confidence Bound and Thompson sampling) pull certain arms, called non-competitive arms, only $O(1)$ times. As a result, we effectively reduce a $K$-armed bandit problem to a $(C+1)$-armed bandit problem, where $C$ is the number of competitive arms, as only $C$ sub-optimal arms are pulled $O(\log T)$ times. In many practical scenarios $C$ can be zero, in which case our proposed C-Bandit algorithms achieve bounded regret. In the special case where rewards are correlated through a latent random variable $X$, we give a regret lower bound showing that bounded regret is possible only when $C = 0$. In addition to simulations, we validate the proposed algorithms via experiments on two real-world recommendation datasets, MovieLens and Goodreads, and show that C-UCB and C-TS significantly outperform classical bandit algorithms.

* arXiv admin note: text overlap with arXiv:1808.05904. A special case of the model studied in this paper is presented in arXiv:1808.05904 
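
A minimal sketch of the C-UCB loop on a toy two-armed instance, with a hypothetical pseudo-reward bound standing in for a learned correlation structure: each round, arms that the pseudo-rewards rule out as competitive are skipped, and UCB plays among the rest.

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([0.7, 0.4])        # true (unknown) Bernoulli arm means
K, T = len(means), 5000

def pseudo_reward(obs_reward):
    # Hypothetical bound: observing reward r on one arm caps the other arm's
    # expected reward at r + 0.2 (a stand-in for domain correlation knowledge).
    return min(1.0, obs_reward + 0.2)

counts, sums = np.zeros(K), np.zeros(K)
pseudo = np.zeros((K, K))           # pseudo[k, l]: summed pseudo-rewards for arm l from arm k's samples

def pull(arm):
    r = rng.binomial(1, means[arm])
    counts[arm] += 1
    sums[arm] += r
    for l in range(K):
        pseudo[arm, l] += r if l == arm else pseudo_reward(r)

for arm in range(K):                # initialize: pull each arm once
    pull(arm)

for t in range(K, T):
    k_max = int(np.argmax(counts))  # most-pulled arm
    emp = sums / counts
    # Competitive arms: expected pseudo-reward (from k_max's samples) is at
    # least k_max's empirical mean; all other arms are skipped this round.
    comp = [l for l in range(K) if pseudo[k_max, l] / counts[k_max] >= emp[k_max]]
    ucb = emp + np.sqrt(2 * np.log(t) / counts)
    pull(max(comp, key=lambda l: ucb[l]))   # UCB restricted to competitive arms

print("pulls per arm:", counts, "| empirical regret:", means.max() * T - sums.sum())
```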
  