Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Recommendation": models, code, and papers

Deconfounded Causal Collaborative Filtering

Oct 14, 2021
Shuyuan Xu, Juntao Tan, Shelby Heinecke, Jia Li, Yongfeng Zhang

Recommender systems may be confounded by various types of confounding factors (also called confounders) that may lead to inaccurate recommendations and sacrificed recommendation performance. Current approaches to solving the problem usually design each specific model for each specific confounder. However, real-world systems may include a huge number of confounders and thus designing each specific model for each specific confounder is unrealistic. More importantly, except for those "explicit confounders" that researchers can manually identify and process such as item's position in the ranking list, there are also many "latent confounders" that are beyond the imagination of researchers. For example, users' rating on a song may depend on their current mood or the current weather, and users' preference on ice creams may depend on the air temperature. Such latent confounders may be unobservable in the recorded training data. To solve the problem, we propose a deconfounded causal collaborative filtering model. We first frame user behaviors with unobserved confounders into a causal graph, and then we design a front-door adjustment model carefully fused with machine learning to deconfound the influence of unobserved confounders. The proposed model is able to handle both global confounders and personalized confounders. Experiments on real-world e-commerce datasets show that our method is able to deconfound unobserved confounders to achieve better recommendation performance.

* 9 pages, 5 figures; comments and suggestions are highly appreciated 

  Access Paper or Ask Questions

Clustered Monotone Transforms for Rating Factorization

Oct 31, 2018
Gaurush Hiranandani, Raghav Somani, Oluwasanmi Koyejo, Sreangsu Acharyya

Exploiting low-rank structure of the user-item rating matrix has been the crux of many recommendation engines. However, existing recommendation engines force raters with heterogeneous behavior profiles to map their intrinsic rating scales to a common rating scale (e.g. 1-5). This non-linear transformation of the rating scale shatters the low-rank structure of the rating matrix, therefore resulting in a poor fit and consequentially, poor recommendations. In this paper, we propose Clustered Monotone Transforms for Rating Factorization (CMTRF), a novel approach to perform regression up to unknown monotonic transforms over unknown population segments. Essentially, for recommendation systems, the technique searches for monotonic transformations of the rating scales resulting in a better fit. This is combined with an underlying matrix factorization regression model that couples the user-wise ratings to exploit shared low dimensional structure. The rating scale transformations can be generated for each user, for a cluster of users, or for all the users at once, forming the basis of three simple and efficient algorithms proposed in this paper, all of which alternate between transformation of the rating scales and matrix factorization regression. Despite the non-convexity, CMTRF is theoretically shown to recover a unique solution under mild conditions. Experimental results on two synthetic and seven real-world datasets show that CMTRF outperforms other state-of-the-art baselines.

* The first two authors contributed equally to the paper. The paper to appear in WSDM 2019 

  Access Paper or Ask Questions

Offline Contextual Multi-armed Bandits for Mobile Health Interventions: A Case Study on Emotion Regulation

Aug 21, 2020
Mawulolo K. Ameko, Miranda L. Beltzer, Lihua Cai, Mehdi Boukhechba, Bethany A. Teachman, Laura E. Barnes

Delivering treatment recommendations via pervasive electronic devices such as mobile phones has the potential to be a viable and scalable treatment medium for long-term health behavior management. But active experimentation of treatment options can be time-consuming, expensive and altogether unethical in some cases. There is a growing interest in methodological approaches that allow an experimenter to learn and evaluate the usefulness of a new treatment strategy before deployment. We present the first development of a treatment recommender system for emotion regulation using real-world historical mobile digital data from n = 114 high socially anxious participants to test the usefulness of new emotion regulation strategies. We explore a number of offline contextual bandits estimators for learning and propose a general framework for learning algorithms. Our experimentation shows that the proposed doubly robust offline learning algorithms performed significantly better than baseline approaches, suggesting that this type of recommender algorithm could improve emotion regulation. Given that emotion regulation is impaired across many mental illnesses and such a recommender algorithm could be scaled up easily, this approach holds potential to increase access to treatment for many people. We also share some insights that allow us to translate contextual bandit models to this complex real-world data, including which contextual features appear to be most important for predicting emotion regulation strategy effectiveness.

* Accepted at RecSys 2020 

  Access Paper or Ask Questions

Risk Aware Ranking for Top-$k$ Recommendations

Apr 12, 2019
Shameem A Puthiya Parambath, Nishant Vijayakumar, Sanjay Chawla

Given an incomplete ratings data over a set of users and items, the preference completion problem aims to estimate a personalized total preference order over a subset of the items. In practical settings, a ranked list of top-$k$ items from the estimated preference order is recommended to the end user in the decreasing order of preference for final consumption. We analyze this model and observe that such a ranking model results in suboptimal performance when the payoff associated with the recommended items is different. We propose a novel and very efficient algorithm for the preference ranking considering the uncertainty regarding the payoffs of the items. Once the preference scores for the users are obtained using any preference learning algorithm, we show that ranking the items using a risk seeking utility function results in the best ranking performance.

  Access Paper or Ask Questions

GAN-based Recommendation with Positive-Unlabeled Sampling

Dec 12, 2020
Yao Zhou, Jianpeng Xu, Jun Wu, Zeinab Taghavi Nasrabadi, Evren Korpeoglu, Kannan Achan, Jingrui He

Recommender systems are popular tools for information retrieval tasks on a large variety of web applications and personalized products. In this work, we propose a Generative Adversarial Network based recommendation framework using a positive-unlabeled sampling strategy. Specifically, we utilize the generator to learn the continuous distribution of user-item tuples and design the discriminator to be a binary classifier that outputs the relevance score between each user and each item. Meanwhile, positive-unlabeled sampling is applied in the learning procedure of the discriminator. Theoretical bounds regarding positive-unlabeled sampling and optimalities of convergence for the discriminators and the generators are provided. We show the effectiveness and efficiency of our framework on three publicly accessible data sets with eight ranking-based evaluation metrics in comparison with thirteen popular baselines.

* 12 pages 

  Access Paper or Ask Questions

Correcting Exposure Bias for Link Recommendation

Jun 13, 2021
Shantanu Gupta, Hao Wang, Zachary C. Lipton, Yuyang Wang

Link prediction methods are frequently applied in recommender systems, e.g., to suggest citations for academic papers or friends in social networks. However, exposure bias can arise when users are systematically underexposed to certain relevant items. For example, in citation networks, authors might be more likely to encounter papers from their own field and thus cite them preferentially. This bias can propagate through naively trained link predictors, leading to both biased evaluation and high generalization error (as assessed by true relevance). Moreover, this bias can be exacerbated by feedback loops. We propose estimators that leverage known exposure probabilities to mitigate this bias and consequent feedback loops. Next, we provide a loss function for learning the exposure probabilities from data. Finally, experiments on semi-synthetic data based on real-world citation networks, show that our methods reliably identify (truly) relevant citations. Additionally, our methods lead to greater diversity in the recommended papers' fields of study. The code is available at

  Access Paper or Ask Questions

SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints

Mar 18, 2022
Eric Daoud, Luca Ganassali, Antoine Baker, Marc Lelarge

For a very broad range of problems, recommendation algorithms have been increasingly used over the past decade. In most of these algorithms, the predictions are built upon user-item affinity scores which are obtained from high-dimensional embeddings of items and users. In more complex scenarios, with geometrical or capacity constraints, prediction based on embeddings may not be sufficient and some additional features should be considered in the design of the algorithm. In this work, we study the recommendation problem in the setting where affinities between users and items are based both on their embeddings in a latent space and on their geographical distance in their underlying euclidean space (e.g., $\mathbb{R}^2$), together with item capacity constraints. This framework is motivated by some real-world applications, for instance in healthcare: the task is to recommend hospitals to patients based on their location, pathology, and hospital capacities. In these applications, there is somewhat of an asymmetry between users and items: items are viewed as static points, their embeddings, capacities and locations constraining the allocation. Upon the observation of an optimal allocation, user embeddings, items capacities, and their positions in their underlying euclidean space, our aim is to recover item embeddings in the latent space; doing so, we are then able to use this estimate e.g. in order to predict future allocations. We propose an algorithm (SiMCa) based on matrix factorization enhanced with optimal transport steps to model user-item affinities and learn item embeddings from observed data. We then illustrate and discuss the results of such an approach for hospital recommendation on synthetic data.

* All comments are welcome 

  Access Paper or Ask Questions

A Deep Learning System for Predicting Size and Fit in Fashion E-Commerce

Jul 23, 2019
Abdul-Saboor Sheikh, Romain Guigoures, Evgenii Koriagin, Yuen King Ho, Reza Shirvany, Roland Vollgraf, Urs Bergmann

Personalized size and fit recommendations bear crucial significance for any fashion e-commerce platform. Predicting the correct fit drives customer satisfaction and benefits the business by reducing costs incurred due to size-related returns. Traditional collaborative filtering algorithms seek to model customer preferences based on their previous orders. A typical challenge for such methods stems from extreme sparsity of customer-article orders. To alleviate this problem, we propose a deep learning based content-collaborative methodology for personalized size and fit recommendation. Our proposed method can ingest arbitrary customer and article data and can model multiple individuals or intents behind a single account. The method optimizes a global set of parameters to learn population-level abstractions of size and fit relevant information from observed customer-article interactions. It further employs customer and article specific embedding variables to learn their properties. Together with learned entity embeddings, the method maps additional customer and article attributes into a latent space to derive personalized recommendations. Application of our method to two publicly available datasets demonstrate an improvement over the state-of-the-art published results. On two proprietary datasets, one containing fit feedback from fashion experts and the other involving customer purchases, we further outperform comparable methodologies, including a recent Bayesian approach for size recommendation.

* Published at the Thirteenth ACM Conference on Recommender Systems (RecSys '19), September 16--20, 2019, Copenhagen, Denmark 

  Access Paper or Ask Questions

Causal Collaborative Filtering

Feb 10, 2021
Shuyuan Xu, Yingqiang Ge, Yunqi Li, Zuohui Fu, Xu Chen, Yongfeng Zhang

Recommender systems are important and valuable tools for many personalized services. Collaborative Filtering (CF) algorithms -- among others -- are fundamental algorithms driving the underlying mechanism of personalized recommendation. Many of the traditional CF algorithms are designed based on the fundamental idea of mining or learning correlative patterns from data for matching, including memory-based methods such as user/item-based CF as well as learning-based methods such as matrix factorization and deep learning models. However, advancing from correlative learning to causal learning is an important problem, because causal/counterfactual modeling can help us to think outside of the observational data for user modeling and personalization. In this paper, we propose Causal Collaborative Filtering (CCF) -- a general framework for modeling causality in collaborative filtering and recommendation. We first provide a unified causal view of CF and mathematically show that many of the traditional CF algorithms are actually special cases of CCF under simplified causal graphs. We then propose a conditional intervention approach for $do$-calculus so that we can estimate the causal relations based on observational data. Finally, we further propose a general counterfactual constrained learning framework for estimating the user-item preferences. Experiments are conducted on two types of real-world datasets -- traditional and randomized trial data -- and results show that our framework can improve the recommendation performance of many CF algorithms.

* 14 pages, 5 figures, 3 tables 

  Access Paper or Ask Questions