In recent years, streaming music platforms have become very popular, mainly because of the huge number of songs they make available to users. With so much content available, these platforms need recommendation mechanisms that help users select the music they like. However, developing reliable recommender systems in the music field involves dealing with many problems, some of which are generic and widely studied in the literature, while others are specific to this application domain and are therefore less well known. This work focuses on two important issues that have not received much attention: managing gray-sheep users and obtaining implicit ratings. The first is usually addressed by resorting to content information, which is often difficult to obtain. The second is related to the sparsity problem that arises when explicit ratings are hard to gather. In this work, both shortcomings are addressed by means of a recommendation approach based on the users' streaming sessions. The method is designed to handle the well-known power-law probability distribution that characterizes users' listening behavior. The proposal improves the recommendation reliability of collaborative filtering methods while reducing the complexity of the procedures used so far to deal with the gray-sheep problem.
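To make the idea of implicit ratings concrete, the following is a minimal sketch of one simple way to turn a user's play counts into ratings. It is not necessarily the method of the work summarized above: the function name `implicit_ratings`, the 1 to 5 scale, and the log scaling used to dampen the power-law tail are all assumptions made for illustration.

```python
import math


def implicit_ratings(play_counts, max_rating=5):
    """Map one user's play counts to integer ratings in [1, max_rating].

    play_counts: dict of item id -> number of plays (non-negative integers).
    Log scaling is an assumption made here to compress the long tail
    typical of power-law listening behavior.
    """
    if not play_counts:
        return {}
    # log1p dampens the few heavily played items that dominate raw counts.
    logs = {item: math.log1p(count) for item, count in play_counts.items()}
    top = max(logs.values())
    if top == 0:
        # Every count is zero: fall back to the lowest rating.
        return {item: 1 for item in play_counts}
    ratings = {}
    for item, value in logs.items():
        # Scale relative to this user's most played item, then round up.
        ratings[item] = max(1, math.ceil(max_rating * value / top))
    return ratings
```

Scaling relative to each user's own most played item keeps heavy and light listeners on the same rating scale, which is one possible way to reduce the sparsity that comes from missing explicit ratings.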
When user profiles are not available, recommendation based on anonymous sessions becomes particularly important. It aims to predict the item that the user will click next from the user's access sequence over a period of time. In recent years, with the development of recurrent neural networks, attention mechanisms, and graph neural networks, the performance of session-based recommendation has improved greatly. However, previous methods did not comprehensively consider both the context dependencies of the session and the priority of short-term interest. Therefore, we propose a context-aware short-term interest first model (CASIF). The aim of this paper is to improve the accuracy of recommendations by combining context and short-term interest. In CASIF, we dynamically construct a graph structure for session sequences and capture rich context dependencies via a graph neural network (GNN); the resulting latent feature vectors serve as inputs to the next step. Then we build the short-term interest first module, which captures the user's general interest from the session in the context of long-term memory and, at the same time, obtains the user's current interest from the last clicked item. Finally, the short-term and long-term interests are combined into the final interest and multiplied by the candidate vectors to obtain the recommendation probabilities. Extensive experiments on two real-world datasets demonstrate the effectiveness of our proposed method.
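The final scoring step described above can be sketched as follows. This is an illustrative sketch only: the abstract does not say how the two interests are combined or how scores become probabilities, so the element-wise sum and the softmax used here, as well as the names `softmax` and `next_item_probabilities`, are assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_item_probabilities(long_term, short_term, candidates):
    """Score candidate items against a combined session interest.

    long_term, short_term: interest vectors of equal length.
    candidates: list of candidate item vectors of the same length.
    """
    # Assumed fusion: element-wise sum of the two interest vectors.
    final = [l + s for l, s in zip(long_term, short_term)]
    # Dot product between the final interest and each candidate vector.
    scores = [sum(f * c for f, c in zip(final, cand)) for cand in candidates]
    return softmax(scores)
```

For example, `next_item_probabilities([1.0, 0.0], [0.0, 1.0], [[1.0, 1.0], [0.0, 0.0]])` assigns the higher probability to the first candidate, because it aligns with both interest vectors.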
Recently, recommender systems that aim to suggest personalized lists of items for users to interact with online have drawn a lot of attention. Many of the state-of-the-art techniques are based on deep learning. Recent studies have shown that these deep learning models (in particular for recommendation systems) are vulnerable to attacks such as data poisoning, which generates fake users to promote a selected set of items. More recently, however, defense strategies have been developed to detect these generated users with fake profiles. Thus, designing advanced injection attacks that create more "realistic" user profiles to promote a set of items is still a key challenge in the domain of deep learning based recommender systems. In this work, we present our framework CopyAttack, a reinforcement learning based black-box attack method that harnesses real users from a source domain by copying their profiles into the target domain with the goal of promoting a subset of items. CopyAttack is constructed to learn, both efficiently and effectively, policy gradient networks that first select and then further refine user profiles from the source domain to ultimately copy into the target domain. CopyAttack's goal is to maximize the hit ratio of the targeted items in the Top-$k$ recommendation list of the users in the target domain. We have conducted experiments on two real-world datasets, empirically verified the effectiveness of our proposed framework, and performed a thorough model analysis.
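The hit ratio that the attack seeks to maximize can be illustrated with a short sketch. The abstract does not give the exact definition, so this assumes a common one: the fraction of target-domain users whose Top-$k$ list contains at least one targeted item. The function name `hit_ratio_at_k` and the input layout are hypothetical.

```python
def hit_ratio_at_k(ranked_lists, target_items, k):
    """Fraction of users whose top-k list contains a targeted item.

    ranked_lists: dict of user id -> list of item ids, best first.
    target_items: iterable of item ids the attacker wants to promote.
    k: length of the recommendation list that is inspected.
    """
    if not ranked_lists:
        return 0.0
    targets = set(target_items)
    # A user counts as a hit if any targeted item appears in the top k.
    hits = sum(
        1 for items in ranked_lists.values()
        if targets.intersection(items[:k])
    )
    return hits / len(ranked_lists)
```

Comparing this value before and after the copied profiles are injected gives a simple measure of how much the targeted items were promoted.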
Reinforcement learning aims to search for the best policy model for decision making and has been shown to be powerful for sequential recommendations. Training a policy by reinforcement learning, however, requires an environment. In many real-world applications, training the policy in the real environment can incur an unbearable cost because of the exploration it requires. Environment reconstruction from past data is thus an appealing way to release the power of reinforcement learning in these applications. Reconstructing the environment basically amounts to extracting the causal effect model from the data. However, real-world applications are often too complex to offer fully observable environment information. Therefore, quite possibly there are unobserved confounding variables lying behind the data. Such hidden confounders can obstruct an effective reconstruction of the environment. In this paper, by treating the hidden confounder as a hidden policy, we propose a deconfounded multi-agent environment reconstruction (DEMER) approach in order to learn the environment together with the hidden confounder. DEMER adopts a multi-agent generative adversarial imitation learning framework. It introduces a confounder-embedded policy and uses a compatible discriminator for training the policies. We then apply DEMER to driver program recommendation. We first use an artificial driver program recommendation environment, abstracted from the real application, to verify and analyze the effectiveness of DEMER. We then test DEMER in the real application of Didi Chuxing. Experimental results show that DEMER can effectively reconstruct the hidden confounder and thus build a better environment. DEMER also derives a recommendation policy with significantly improved performance in the test phase of the real application.
With the information explosion of news articles, personalized news recommendation has become important for helping users quickly find news that they are interested in. Existing news recommendation methods mainly include collaborative filtering methods, which rely on direct user-item interactions, and content-based methods, which characterize the content of the user's reading history. Although these methods have achieved good performance, they still suffer from the data sparsity problem, since most of them fail to extensively exploit high-order structure information (e.g., similar users tend to read similar news articles) in news recommendation systems. In this paper, we propose to build a heterogeneous graph to explicitly model the interactions among users, news and latent topics. The incorporated topic information helps indicate a user's interest and alleviates the sparsity of user-item interactions. We then take advantage of graph neural networks to learn user and news representations that encode high-order structure information by propagating embeddings over the graph. The user embeddings learned from the complete history of user clicks capture the users' long-term interests. We also model a user's short-term interest from the recent reading history with an attention-based LSTM model. Experimental results on real-world datasets show that our proposed model significantly outperforms state-of-the-art methods on news recommendation.
Recent years have witnessed growing interest in multimedia recommendation, which aims to predict whether a user will interact with an item that has multimodal content. Previous studies focus on modeling user-item interactions with multimodal features included as side information. However, this scheme is not well designed for multimedia recommendation. Firstly, only collaborative item-item relationships are implicitly modeled through high-order item-user-item co-occurrences. We argue that the latent semantic item-item structures underlying the multimodal content could be beneficial for learning better item representations and could help recommender models comprehensively discover candidate items. Secondly, previous studies disregard fine-grained multimodal fusion. Although access to multiple modalities might allow us to capture rich information, we argue that the simple coarse-grained fusion by linear combination or concatenation used in previous work is insufficient to fully understand content information and item relationships. To this end, we propose a latent structure MIning with ContRastive mOdality fusion method (MICRO for brevity). Specifically, we devise a novel modality-aware structure learning module, which learns item-item relationships for each modality. Based on the learned modality-aware latent item relationships, we perform graph convolutions that explicitly inject item affinities into modality-aware item representations. Then, we design a novel contrastive method to fuse multimodal features. These enriched item representations can be plugged into existing collaborative filtering methods to make more accurate recommendations. Extensive experiments on real-world datasets demonstrate the superiority of our method over state-of-the-art baselines.
Recommender systems (RSs) employ user-item feedback, e.g., ratings, to match customers to personalized lists of products. Approaches to top-k recommendation mainly rely on Learning-To-Rank algorithms, and the most widely adopted among them is Bayesian Personalized Ranking (BPR), which is based on a pair-wise optimization approach. Recently, BPR has been found to be vulnerable to adversarial perturbations of its model parameters. Adversarial Personalized Ranking (APR) mitigates this issue by robustifying BPR via an adversarial training procedure. APR's empirical accuracy improvements over BPR have led to its wide use in several recommender models. However, a key overlooked aspect has been the beyond-accuracy performance of APR, i.e., novelty, coverage, and amplification of popularity bias, considering that recent results suggest that BPR, the building block of APR, is sensitive to the intensification of biases and the reduction of recommendation novelty. In this work, we model the learning characteristics of the BPR and APR optimization frameworks to give mathematical evidence that, when the feedback data have a tailed distribution, APR amplifies the popularity bias more than BPR because of an unbalanced number of positive updates received from short-head items. Using matrix factorization (MF), we empirically validate the theoretical results by performing preliminary experiments on two public datasets to compare the performance of BPR-MF and APR-MF on accuracy and beyond-accuracy metrics. The experimental results consistently show a degradation of the novelty and coverage measures and a worrying amplification of bias.
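The pair-wise objective underlying BPR can be sketched for a single training triple. This is a minimal sketch assuming a matrix factorization scorer, where the predicted preference is the dot product of user and item vectors; regularization and the adversarial perturbation added by APR are omitted, and the names `dot` and `bpr_loss` are chosen here for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def bpr_loss(user_vec, pos_vec, neg_vec):
    """BPR loss -ln(sigmoid(x_ui - x_uj)) for one triple.

    user_vec: latent factors of user u.
    pos_vec: latent factors of an item i the user interacted with.
    neg_vec: latent factors of an item j the user did not interact with.
    """
    diff = dot(user_vec, pos_vec) - dot(user_vec, neg_vec)
    # -ln(sigmoid(d)) equals ln(1 + exp(-d)); branch to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))
```

The loss shrinks as the positive item is scored further above the negative one. Because each positive interaction triggers such an update, items with many interactions receive many more positive updates, which is the imbalance that the analysis above connects to popularity bias.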
When suggesting Points of Interest (PoIs) to people with autism spectrum disorders, we must take into account that they have idiosyncratic sensory aversions to noise, brightness and other features that influence the way they perceive places. Recommender systems must therefore deal with these aspects. However, retrieving sensory data about PoIs is a real challenge because most geographical information servers fail to provide this data. Moreover, ad-hoc crowdsourcing campaigns are not guaranteed to cover large geographical areas and lack sustainability. Thus, we investigate the extraction of sensory data about places from the consumer feedback collected by location-based services, on which people from all over the world spontaneously post reviews. Specifically, we propose a model for extracting sensory data from reviews about PoIs and integrating it into recommender systems to predict item ratings by considering both user preferences and compatibility information. We tested our approach with autistic and neurotypical people by integrating it into diverse recommendation algorithms. For the test, we used one dataset built in a crowdsourcing campaign and another extracted from TripAdvisor reviews. The results show that the algorithms obtain the highest accuracy and ranking capability when using TripAdvisor data. Moreover, when the two datasets are used jointly, the algorithms further improve their performance. These results encourage the use of consumer feedback as a reliable source of information about places in the development of inclusive recommender systems.
Most current recommender systems use users' historical behaviour data to predict user preferences. However, it is difficult to recommend items accurately to new users. To alleviate this problem, existing user cold-start methods either apply deep learning to build a cross-domain recommender system or map user attributes into the space of user behaviour. These methods are harder to apply to online travel platforms (e.g., Fliggy), because it is hard to find a source domain in which users behave similarly to travel scenarios, and because the Location Based Services (LBS) information of users has not received sufficient attention. In this work, we propose an LBS-based Heterogeneous Relations Model (LHRM) for user cold-start recommendation, which uses the user's LBS information, behaviour information in related domains, and behaviour information on travel platforms (e.g., Fliggy) to construct heterogeneous relations between users and items. Moreover, an attention-based multi-layer perceptron is applied to extract latent factors of users and items. In this way, LHRM achieves better generalization performance than existing methods. Experimental results on real data from Fliggy's offline log illustrate the effectiveness of LHRM.