In this paper, we present our work to support publishers and editors in finding descriptive tags for e-books through tag recommendations. We propose a hybrid tag recommendation system for e-books, which leverages search query terms from Amazon users and e-book metadata, which is assigned by publishers and editors. Our idea is to mimic the vocabulary of users in Amazon, who search for and review e-books, and to combine these search terms with editor tags in a hybrid tag recommendation approach. In total, we evaluate 19 tag recommendation algorithms on the review content of Amazon users, which reflects the readers' vocabulary. Our results show that we can improve the performance of tag recommender systems for e-books both concerning tag recommendation accuracy, diversity as well as a novel semantic similarity metric, which we also propose in this paper.
Conversational recommender systems (CRS) enable the traditional recommender systems to explicitly acquire user preferences towards items and attributes through interactive conversations. Reinforcement learning (RL) is widely adopted to learn conversational recommendation policies to decide what attributes to ask, which items to recommend, and when to ask or recommend, at each conversation turn. However, existing methods mainly target at solving one or two of these three decision-making problems in CRS with separated conversation and recommendation components, which restrict the scalability and generality of CRS and fall short of preserving a stable training procedure. In the light of these challenges, we propose to formulate these three decision-making problems in CRS as a unified policy learning task. In order to systematically integrate conversation and recommendation components, we develop a dynamic weighted graph based RL method to learn a policy to select the action at each conversation turn, either asking an attribute or recommending items. Further, to deal with the sample efficiency issue, we propose two action selection strategies for reducing the candidate action space according to the preference and entropy information. Experimental results on two benchmark CRS datasets and a real-world E-Commerce application show that the proposed method not only significantly outperforms state-of-the-art methods but also enhances the scalability and stability of CRS.
The key of sequential recommendation lies in the accurate item correlation modeling. Previous models infer such information based on item co-occurrences, which may fail to capture the real causal relations, and impact the recommendation performance and explainability. In this paper, we equip sequential recommendation with a novel causal discovery module to capture causalities among user behaviors. Our general idea is firstly assuming a causal graph underlying item correlations, and then we learn the causal graph jointly with the sequential recommender model by fitting the real user behavior data. More specifically, in order to satisfy the causality requirement, the causal graph is regularized by a differentiable directed acyclic constraint. Considering that the number of items in recommender systems can be very large, we represent different items with a unified set of latent clusters, and the causal graph is defined on the cluster level, which enhances the model scalability and robustness. In addition, we provide theoretical analysis on the identifiability of the learned causal graph. To the best of our knowledge, this paper makes a first step towards combining sequential recommendation with causal discovery. For evaluating the recommendation performance, we implement our framework with different neural sequential architectures, and compare them with many state-of-the-art methods based on real-world datasets. Empirical studies manifest that our model can on average improve the performance by about 7% and 11% on f1 and NDCG, respectively. To evaluate the model explainability, we build a new dataset with human labeled explanations for both quantitative and qualitative analysis.
Nowadays, people start to use online reservation systems to plan their vacations since they have vast amount of choices available. Selecting when and where to go from this large-scale options is getting harder. In addition, sometimes consumers can miss the better options due to the wealth of information to be found on the online reservation systems. In this sense, personalized services such as recommender systems play a crucial role in decision making. Two traditional recommendation techniques are content-based and collaborative filtering. While both methods have their advantages, they also have certain disadvantages, some of which can be solved by combining both techniques to improve the quality of the recommendation. The resulting system is known as a hybrid recommender system. This paper presents a new hybrid hotel recommendation system that has been developed by combining content-based and collaborative filtering approaches that recommends customer the hotel they need and save them from time loss.
We investigate whether model extraction can be used to "steal" the weights of sequential recommender systems, and the potential threats posed to victims of such attacks. This type of risk has attracted attention in image and text classification, but to our knowledge not in recommender systems. We argue that sequential recommender systems are subject to unique vulnerabilities due to the specific autoregressive regimes used to train them. Unlike many existing recommender attackers, which assume the dataset used to train the victim model is exposed to attackers, we consider a data-free setting, where training data are not accessible. Under this setting, we propose an API-based model extraction method via limited-budget synthetic data generation and knowledge distillation. We investigate state-of-the-art models for sequential recommendation and show their vulnerability under model extraction and downstream attacks. We perform attacks in two stages. (1) Model extraction: given different types of synthetic data and their labels retrieved from a black-box recommender, we extract the black-box model to a white-box model via distillation. (2) Downstream attacks: we attack the black-box model with adversarial samples generated by the white-box recommender. Experiments show the effectiveness of our data-free model extraction and downstream attacks on sequential recommenders in both profile pollution and data poisoning settings.
Today's research in recommender systems is largely based on experimental designs that are static in a sense that they do not consider potential longitudinal effects of providing recommendations to users. In reality, however, various important and interesting phenomena only emerge or become visible over time, e.g., when a recommender system continuously reinforces the popularity of already successful artists on a music streaming site or when recommendations that aim at profit maximization lead to a loss of consumer trust in the long run. In this paper, we discuss how Agent-Based Modeling and Simulation (ABM) techniques can be used to study such important longitudinal dynamics of recommender systems. To that purpose, we provide an overview of the ABM principles, outline a simulation framework for recommender systems based on the literature, and discuss various practical research questions that can be addressed with such an ABM-based simulation framework.
One of the most challenging recommendation tasks is recommending to a new, previously unseen user. This is known as the 'user cold start' problem. Assuming certain features or attributes of users are known, one approach for handling new users is to initially model them based on their features. Motivated by an ad targeting application, this paper describes an extreme online recommendation setting where the cold start problem is perpetual. Every user is encountered by the system just once, receives a recommendation, and either consumes or ignores it, registering a binary reward. We introduce One-pass Factorization of Feature Sets, OFF-Set, a novel recommendation algorithm based on Latent Factor analysis, which models users by mapping their features to a latent space. Furthermore, OFF-Set is able to model non-linear interactions between pairs of features. OFF-Set is designed for purely online recommendation, performing lightweight updates of its model per each recommendation-reward observation. We evaluate OFF-Set against several state of the art baselines, and demonstrate its superiority on real ad-targeting data.
News recommendation is critical for personalized news distribution. Federated news recommendation enables collaborative model learning from many clients without sharing their raw data. It is promising for privacy-preserving news recommendation. However, the security of federated news recommendation is still unclear. In this paper, we study this problem by proposing an untargeted attack called UA-FedRec. By exploiting the prior knowledge of news recommendation and federated learning, UA-FedRec can effectively degrade the model performance with a small percentage of malicious clients. First, the effectiveness of news recommendation highly depends on user modeling and news modeling. We design a news similarity perturbation method to make representations of similar news farther and those of dissimilar news closer to interrupt news modeling, and propose a user model perturbation method to make malicious user updates in opposite directions of benign updates to interrupt user modeling. Second, updates from different clients are typically aggregated by weighted-averaging based on their sample sizes. We propose a quantity perturbation method to enlarge sample sizes of malicious clients in a reasonable range to amplify the impact of malicious updates. Extensive experiments on two real-world datasets show that UA-FedRec can effectively degrade the accuracy of existing federated news recommendation methods, even when defense is applied. Our study reveals a critical security issue in existing federated news recommendation systems and calls for research efforts to address the issue.
Top-N recommendation, which aims to learn user ranking-based preference, has long been a fundamental problem in a wide range of applications. Traditional models usually motivate themselves by designing complex or tailored architectures based on different assumptions. However, the training data of recommender system can be extremely sparse and imbalanced, which poses great challenges for boosting the recommendation performance. To alleviate this problem, in this paper, we propose to reformulate the recommendation task within the causal inference framework, which enables us to counterfactually simulate user ranking-based preferences to handle the data scarce problem. The core of our model lies in the counterfactual question: "what would be the user's decision if the recommended items had been different?". To answer this question, we firstly formulate the recommendation process with a series of structural equation models (SEMs), whose parameters are optimized based on the observed data. Then, we actively indicate many recommendation lists (called intervention in the causal inference terminology) which are not recorded in the dataset, and simulate user feedback according to the learned SEMs for generating new training samples. Instead of randomly intervening on the recommendation list, we design a learning-based method to discover more informative training samples. Considering that the learned SEMs can be not perfect, we, at last, theoretically analyze the relation between the number of generated samples and the model prediction error, based on which a heuristic method is designed to control the negative effect brought by the prediction error. Extensive experiments are conducted based on both synthetic and real-world datasets to demonstrate the effectiveness of our framework.