Recommender systems play a key role in shaping modern web ecosystems. These systems alternate between (1) making recommendations (2) collecting user responses to these recommendations, and (3) retraining the recommendation algorithm based on this feedback. During this process the recommender system influences the user behavioral data that is subsequently used to update it, thus creating a feedback loop. Recent work has shown that feedback loops may compromise recommendation quality and homogenize user behavior, raising ethical and performance concerns when deploying recommender systems. To address these issues, we propose the Causal Adjustment for Feedback Loops (CAFL), an algorithm that provably breaks feedback loops using causal inference and can be applied to any recommendation algorithm that optimizes a training loss. Our main observation is that a recommender system does not suffer from feedback loops if it reasons about causal quantities, namely the intervention distributions of recommendations on user ratings. Moreover, we can calculate this intervention distribution from observational data by adjusting for the recommender system's predictions of user preferences. Using simulated environments, we demonstrate that CAFL improves recommendation quality when compared to prior correction methods.
Explainable recommendation has shown its great advantages for improving recommendation persuasiveness, user satisfaction, system transparency, among others. A fundamental problem of explainable recommendation is how to evaluate the explanations. In the past few years, various evaluation strategies have been proposed. However, they are scattered in different papers, and there lacks a systematic and detailed comparison between them. To bridge this gap, in this paper, we comprehensively review the previous work, and provide different taxonomies for them according to the evaluation perspectives and evaluation methods. Beyond summarizing the previous work, we also analyze the (dis)advantages of existing evaluation methods and provide a series of guidelines on how to select them. The contents of this survey are based on more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, Recsys, UMAP, and IUI, and their complete summarization are presented at https://shimo.im/sheets/VKrpYTcwVH6KXgdy/MODOC/. With this survey, we finally aim to provide a clear and comprehensive review on the evaluation of explainable recommendation.
By providing explanations for users and system designers to facilitate better understanding and decision making, explainable recommendation has been an important research problem. In this paper, we propose Counterfactual Explainable Recommendation (CountER), which takes the insights of counterfactual reasoning from causal inference for explainable recommendation. CountER is able to formulate the complexity and the strength of explanations, and it adopts a counterfactual learning framework to seek simple (low complexity) and effective (high strength) explanations for the model decision. Technically, for each item recommended to each user, CountER formulates a joint optimization problem to generate minimal changes on the item aspects so as to create a counterfactual item, such that the recommendation decision on the counterfactual item is reversed. These altered aspects constitute the explanation of why the original item is recommended. The counterfactual explanation helps both the users for better understanding and the system designers for better model debugging. Another contribution of the work is the evaluation of explainable recommendation, which has been a challenging task. Fortunately, counterfactual explanations are very suitable for standard quantitative evaluation. To measure the explanation quality, we design two types of evaluation metrics, one from user's perspective (i.e. why the user likes the item), and the other from model's perspective (i.e. why the item is recommended by the model). We apply our counterfactual learning algorithm on a black-box recommender system and evaluate the generated explanations on five real-world datasets. Results show that our model generates more accurate and effective explanations than state-of-the-art explainable recommendation models.
Traditional recommendation algorithms develop techniques that can help people to choose desirable items. However, in many real-world applications, along with a set of recommendations, it is also essential to quantify each recommendation's (un)certainty. The conformal recommender system uses the experience of a user to output a set of recommendations, each associated with a precise confidence value. Given a significance level $\varepsilon$, it provides a bound $\varepsilon$ on the probability of making a wrong recommendation. The conformal framework uses a key concept called nonconformity measure that measure the strangeness of an item concerning other items. One of the significant design challenges of any conformal recommendation framework is integrating nonconformity measure with the recommendation algorithm. In this paper, we introduce an inductive variant of a conformal recommender system. We propose and analyze different nonconformity measures in the inductive setting. We also provide theoretical proofs on the error-bound and the time complexity. Extensive empirical analysis on ten benchmark datasets demonstrates that the inductive variant substantially improves the performance in computation time while preserving the accuracy.
Many recommender systems suffer from the popularity bias problem: popular items are being recommended frequently while less popular, niche products, are recommended rarely if not at all. However, those ignored products are exactly the products that businesses need to find customers for and their recommendations would be more beneficial. In this paper, we examine an item weighting approach to improve long-tail recommendation. Our approach works as a simple yet powerful add-on to existing recommendation algorithms for making a tunable trade-off between accuracy and long-tail coverage.
With the prevalence of online social media, users' social connections have been widely studied and utilized to enhance the performance of recommender systems. In this paper, we explore the use of hyperbolic geometry for social recommendation. We present Hyperbolic Social Recommender (HSR), a novel social recommendation framework that utilizes hyperbolic geometry to boost the performance. With the help of hyperbolic spaces, HSR can learn high-quality user and item representations for better modeling user-item interaction and user-user social relations. Via a series of extensive experiments, we show that our proposed HSR outperforms its Euclidean counterpart and state-of-the-art social recommenders in click-through rate prediction and top-K recommendation, demonstrating the effectiveness of social recommendation in the hyperbolic space.
Recommendation systems have become very popular in recent years and are used in various web applications. Modern recommendation systems aim at providing users with personalized recommendations of online products or services. Various recommendation techniques, such as content-based, collaborative filtering-based, knowledge-based, and hybrid-based recommendation systems, have been developed to fulfill the needs in different scenarios. This paper presents a comprehensive review of historical and recent state-of-the-art recommendation approaches, followed by an in-depth analysis of groundbreaking advances in modern recommendation systems based on big data. Furthermore, this paper reviews the issues faced in modern recommendation systems such as sparsity, scalability, and diversity and illustrates how these challenges can be transformed into prolific future research avenues.
Today, Internet is one of the widest available media worldwide. Recommendation systems are increasingly being used in various applications such as movie recommendation, mobile recommendation, article recommendation and etc. Collaborative Filtering (CF) and Content-Based (CB) are Well-known techniques for building recommendation systems. Topic modeling based on LDA, is a powerful technique for semantic mining and perform topic extraction. In the past few years, many articles have been published based on LDA technique for building recommendation systems. In this paper, we present taxonomy of recommendation systems and applications based on LDA. In addition, we utilize LDA and Gibbs sampling algorithms to evaluate ISWC and WWW conference publications in computer science. Our study suggest that the recommendation systems based on LDA could be effective in building smart recommendation system in online communities.
In package recommendations, a set of items is regarded as a unified package towards a single common goal, whereas conventional recommender systems treat items independently. For example, for music playlist recommendations, each package (i.e., playlist) should be consistent with respect to the genres. In group recommendations, items are recommended to a group of users, whereas conventional recommender systems recommend items to an individual user. Different from the conventional settings, it is difficult to measure the utility of group recommendations because it involves more than one user. In particular, fairness is crucial in group recommendations. Even if some members in a group are substantially satisfied with a recommendation, it is undesirable if other members are ignored to increase the total utility. Various methods for evaluating and applying the fairness of group recommendations have been proposed in the literature. However, all these methods maximize the score and output only a single package. This is in contrast to conventional recommender systems, which output several (e.g., top-$K$) candidates. This can be problematic because a group can be dissatisfied with the recommended package owing to some unobserved reasons, even if the score is high. In particular, each fairness measure is not absolute, and users may call for different fairness criteria than the one adopted in the recommender system in operation. To address this issue, we propose a method to enumerate fair packages so that a group can select their favorite packages from the list. Our proposed method can enumerate fair packages efficiently, and users can search their favorite packages by various filtering queries. We confirm that our algorithm scales to large datasets and can balance several aspects of the utility of the packages.