So far, most research on recommender systems focused on maintaining long-term user engagement and satisfaction, by promoting relevant and personalized content. However, it is still very challenging to evaluate the quality and the reliability of this content. In this paper, we propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the recommended content on online platforms. The framework exploits the demonstrated trajectories of an expert (assumed to be reliable) in a recommendation evaluation environment, to recover an unknown utility function. This function is used to learn an optimal policy describing the expert's behavior, which is then used in the framework to provide high-quality and personalized recommendations. We evaluate the performance of our solution through a user interest simulation environment (using RecSim). We simulate interactions under the aforementioned expert policy for videos recommendation, and compare its efficiency with standard recommendation methods. The results show that our approach provides a significant gain in terms of content quality, evaluated by experts and watched by users, while maintaining almost the same watch time as the baseline approaches.
What we discover and see online, and consequently our opinions and decisions, are becoming increasingly affected by automated machine learned predictions. Similarly, the predictive accuracy of learning machines heavily depends on the feedback data that we provide them. This mutual influence can lead to closed-loop interactions that may cause unknown biases which can be exacerbated after several iterations of machine learning predictions and user feedback. Machine-caused biases risk leading to undesirable social effects ranging from polarization to unfairness and filter bubbles. In this paper, we study the bias inherent in widely used recommendation strategies such as matrix factorization. Then we model the exposure that is borne from the interaction between the user and the recommender system and propose new debiasing strategies for these systems. Finally, we try to mitigate the recommendation system bias by engineering solutions for several state of the art recommender system models. Our results show that recommender systems are biased and depend on the prior exposure of the user. We also show that the studied bias iteratively decreases diversity in the output recommendations. Our debiasing method demonstrates the need for alternative recommendation strategies that take into account the exposure process in order to reduce bias. Our research findings show the importance of understanding the nature of and dealing with bias in machine learning models such as recommender systems that interact directly with humans, and are thus causing an increasing influence on human discovery and decision making
We report the "Recurrent Deterioration" (RD) phenomenon observed in online recommender systems. The RD phenomenon is reflected by the trend of performance degradation when the recommendation model is always trained based on users' feedbacks of the previous recommendations. There are several reasons for the recommender systems to encounter the RD phenomenon, including the lack of negative training data and the evolution of users' interests, etc. Motivated to tackle the problems causing the RD phenomenon, we propose the POMDP-Rec framework, which is a neural-optimized Partially Observable Markov Decision Process algorithm for recommender systems. We show that the POMDP-Rec framework effectively uses the accumulated historical data from real-world recommender systems and automatically achieves comparable results with those models fine-tuned exhaustively by domain exports on public datasets.
Recommender Systems are nowadays successfully used by all major web sites (from e-commerce to social media) to filter content and make suggestions in a personalized way. Academic research largely focuses on the value of recommenders for consumers, e.g., in terms of reduced information overload. To what extent and in which ways recommender systems create business value is, however, much less clear, and the literature on the topic is scattered. In this research commentary, we review existing publications on field tests of recommender systems and report which business-related performance measures were used in such real-world deployments. We summarize common challenges of measuring the business value in practice and critically discuss the value of algorithmic improvements and offline experiments as commonly done in academic environments. Overall, our review indicates that various open questions remain both regarding the realistic quantification of the business effects of recommenders and the performance assessment of recommendation algorithms in academia.
Self-supervised learning (SSL) recently has achieved outstanding success on recommendation. By setting up an auxiliary task (either predictive or contrastive), SSL can discover supervisory signals from the raw data without human annotation, which greatly mitigates the problem of sparse user-item interactions. However, most SSL-based recommendation models rely on general-purpose auxiliary tasks, e.g., maximizing correspondence between node representations learned from the original and perturbed interaction graphs, which are explicitly irrelevant to the recommendation task. Accordingly, the rich semantics reflected by social relationships and item categories, which lie in the recommendation data-based heterogeneous graphs, are not fully exploited. To explore recommendation-specific auxiliary tasks, we first quantitatively analyze the heterogeneous interaction data and find a strong positive correlation between the interactions and the number of user-item paths induced by meta-paths. Based on the finding, we design two auxiliary tasks that are tightly coupled with the target task (one is predictive and the other one is contrastive) towards connecting recommendation with the self-supervision signals hiding in the positive correlation. Finally, a model-agnostic DUal-Auxiliary Learning (DUAL) framework which unifies the SSL and recommendation tasks is developed. The extensive experiments conducted on three real-world datasets demonstrate that DUAL can significantly improve recommendation, reaching the state-of-the-art performance.
Conversational recommender systems (CRSs) have revolutionized the conventional recommendation paradigm by embracing dialogue agents to dynamically capture the fine-grained user preference. In a typical conversational recommendation scenario, a CRS firstly generates questions to let the user clarify her/his demands and then makes suitable recommendations. Hence, the ability to generate suitable clarifying questions is the key to timely tracing users' dynamic preferences and achieving successful recommendations. However, existing CRSs fall short in asking high-quality questions because: (1) system-generated responses heavily depends on the performance of the dialogue policy agent, which has to be trained with huge conversation corpus to cover all circumstances; and (2) current CRSs cannot fully utilize the learned latent user profiles for generating appropriate and personalized responses. To mitigate these issues, we propose the Knowledge-Based Question Generation System (KBQG), a novel framework for conversational recommendation. Distinct from previous conversational recommender systems, KBQG models a user's preference in a finer granularity by identifying the most relevant relations from a structured knowledge graph (KG). Conditioned on the varied importance of different relations, the generated clarifying questions could perform better in impelling users to provide more details on their preferences. Finially, accurate recommendations can be generated in fewer conversational turns. Furthermore, the proposed KBQG outperforms all baselines in our experiments on two real-world datasets.
Most of existing embedding based recommendation models use embeddings (vectors) corresponding to a single fixed point in low-dimensional space, to represent users and items. Such embeddings fail to precisely represent the users/items with uncertainty often observed in recommender systems. Addressing this problem, we propose a unified deep recommendation framework employing Gaussian embeddings, which are proven adaptive to uncertain preferences exhibited by some users, resulting in better user representations and recommendation performance. Furthermore, our framework adopts Monte-Carlo sampling and convolutional neural networks to compute the correlation between the objective user and the candidate item, based on which precise recommendations are achieved. Our extensive experiments on two benchmark datasets not only justify that our proposed Gaussian embeddings capture the uncertainty of users very well, but also demonstrate its superior performance over the state-of-the-art recommendation models.
Graph-based recommender systems (GRSs) analyze the structural information in the graphical representation of data to make better recommendations, especially when the direct user-item relation data is sparse. Ranking-oriented GRSs that form a major class of recommendation systems, mostly use the graphical representation of preference (or rank) data for measuring node similarities, from which they can infer a recommendation list using a neighborhood-based mechanism. In this paper, we propose PGRec, a novel graph-based ranking-oriented recommendation framework. PGRec models the preferences of the users over items, by a novel graph structure called PrefGraph. This graph is then exploited by an improved embedding approach, taking advantage of both factorization and deep learning methods, to extract vectors representing users, items, and preferences. The resulting embedding are then used for predicting users' unknown pairwise preferences from which the final recommendation lists are inferred. We have evaluated the performance of the proposed method against the state of the art model-based and neighborhood-based recommendation methods, and our experiments show that PGRec outperforms the baseline algorithms up to 3.2% in terms of [email protected] in different MovieLens datasets.
At present, most research on the fairness of recommender systems is conducted either from the perspective of customers or from the perspective of product (or service) providers. However, such a practice ignores the fact that when fairness is guaranteed to one side, the fairness and rights of the other side are likely to reduce. In this paper, we consider recommendation scenarios from the perspective of two sides (customers and providers). From the perspective of providers, we consider the fairness of the providers' exposure in recommender system. For customers, we consider the fairness of the reduced quality of recommendation results due to the introduction of fairness measures. We theoretically analyzed the relationship between recommendation quality, customers fairness, and provider fairness, and design a two-sided fairness-aware recommendation model (TFROM) for both customers and providers. Specifically, we design two versions of TFROM for offline and online recommendation. The effectiveness of the model is verified on three real-world data sets. The experimental results show that TFROM provides better two-sided fairness while still maintaining a higher level of personalization than the baseline algorithms.
Recently, real-world recommendation systems need to deal with millions of candidates. It is extremely challenging to conduct sophisticated end-to-end algorithms on the entire corpus due to the tremendous computation costs. Therefore, conventional recommendation systems usually contain two modules. The matching module focuses on the coverage, which aims to efficiently retrieve hundreds of items from large corpora, while the ranking module generates specific ranks for these items. Recommendation diversity is an essential factor that impacts user experience. Most efforts have explored recommendation diversity in ranking, while the matching module should take more responsibility for diversity. In this paper, we propose a novel Heterogeneous graph neural network framework for diversified recommendation (GraphDR) in matching to improve both recommendation accuracy and diversity. Specifically, GraphDR builds a huge heterogeneous preference network to record different types of user preferences, and conduct a field-level heterogeneous graph attention network for node aggregation. We also innovatively conduct a neighbor-similarity based loss to balance both recommendation accuracy and diversity for the diversified matching task. In experiments, we conduct extensive online and offline evaluations on a real-world recommendation system with various accuracy and diversity metrics and achieve significant improvements. We also conduct model analyses and case study for a better understanding of our model. Moreover, GraphDR has been deployed on a well-known recommendation system, which affects millions of users. The source code will be released.