South Indian classical music (Carnatic music) is best consumed through live concerts. A carnatic recital requires meticulous planning accounting for several parameters like the performers' repertoire, composition variety, musical versatility, thematic structure, the recital's arrangement, etc. to ensure that the audience have a comprehensive listening experience. In this work, we present ragamAI a novel machine learning framework that utilizes the tonic nuances and musical structures in the carnatic music to generate a concert recital that melodically captures the entire range in an octave. Utilizing the underlying idea of playlist and session-based recommender models, the proposed model studies the mathematical structure present in past concerts and recommends relevant items for the playlist/concert. ragamAI ensembles recommendations given by multiple models to learn user idea and past preference of sequences in concerts to extract recommendations. Our experiments on a vast collection of concert show that our model performs 25%-50% better than baseline models. ragamAI's applications are two-fold. 1) it will assist musicians to customize their performance with the necessary variety required to sustain the interest of the audience for the entirety of the concert 2) it will generate carefully curated lists of south Indian classical music so that the listener can discover the wide range of melody that the musical system can offer.
Collaborative Filtering (CF) is one of the most successful approaches for recommender systems. With the emergence of online social networks, social recommendation has become a popular research direction. Most of these social recommendation models utilized each user's local neighbors' preferences to alleviate the data sparsity issue in CF. However, they only considered the local neighbors of each user and neglected the process that users' preferences are influenced as information diffuses in the social network. Recently, Graph Convolutional Networks~(GCN) have shown promising results by modeling the information diffusion process in graphs that leverage both graph structure and node feature information. To this end, in this paper, we propose an effective graph convolutional neural network based model for social recommendation. Based on a classical CF model, the key idea of our proposed model is that we borrow the strengths of GCNs to capture how users' preferences are influenced by the social diffusion process in social networks. The diffusion of users' preferences is built on a layer-wise diffusion manner, with the initial user embedding as a function of the current user's features and a free base user latent vector that is not contained in the user feature. Similarly, each item's latent vector is also a combination of the item's free latent vector, as well as its feature representation. Furthermore, we show that our proposed model is flexible when user and item features are not available. Finally, extensive experimental results on two real-world datasets clearly show the effectiveness of our proposed model.
In recent years, streaming music platforms have become very popular mainly due to the huge number of songs these systems make available to users. This enormous availability means that recommendation mechanisms that help users to select the music they like need to be incorporated. However, developing reliable recommender systems in the music field involves dealing with many problems, some of which are generic and widely studied in the literature, while others are specific to this application domain and are therefore less well-known. This work is focused on two important issues that have not received much attention: managing gray-sheep users and obtaining implicit ratings. The first one is usually addressed by resorting to content information that is often difficult to obtain. The other drawback is related to the sparsity problem that arises when there are obstacles to gather explicit ratings. In this work, the referred shortcomings are addressed by means of a recommendation approach based on the users' streaming sessions. The method is aimed at managing the well-known power-law probability distribution representing the listening behavior of users. This proposal improves the recommendation reliability of collaborative filtering methods while reducing the complexity of the procedures used so far to deal with the gray-sheep problem.
In the case that user profiles are not available, the recommendation based on anonymous session is particularly important, which aims to predict the items that the user may click at the next moment based on the user's access sequence over a while. In recent years, with the development of recurrent neural network, attention mechanism, and graph neural network, the performance of session-based recommendation has been greatly improved. However, the previous methods did not comprehensively consider the context dependencies and short-term interest first of the session. Therefore, we propose a context-aware short-term interest first model (CASIF).The aim of this paper is improve the accuracy of recommendations by combining context and short-term interest. In CASIF, we dynamically construct a graph structure for session sequences and capture rich context dependencies via graph neural network (GNN), latent feature vectors are captured as inputs of the next step. Then we build the short-term interest first module, which can to capture the user's general interest from the session in the context of long-term memory, at the same time get the user's current interest from the item of the last click. In the end, the short-term and long-term interest are combined as the final interest and multiplied by the candidate vector to obtain the recommendation probability. Finally, a large number of experiments on two real-world datasets demonstrate the effectiveness of our proposed method.
Recently, recommender systems that aim to suggest personalized lists of items for users to interact with online have drawn a lot of attention. In fact, many of these state-of-the-art techniques have been deep learning based. Recent studies have shown that these deep learning models (in particular for recommendation systems) are vulnerable to attacks, such as data poisoning, which generates users to promote a selected set of items. However, more recently, defense strategies have been developed to detect these generated users with fake profiles. Thus, advanced injection attacks of creating more `realistic' user profiles to promote a set of items is still a key challenge in the domain of deep learning based recommender systems. In this work, we present our framework CopyAttack, which is a reinforcement learning based black-box attack method that harnesses real users from a source domain by copying their profiles into the target domain with the goal of promoting a subset of items. CopyAttack is constructed to both efficiently and effectively learn policy gradient networks that first select, and then further refine/craft, user profiles from the source domain to ultimately copy into the target domain. CopyAttack's goal is to maximize the hit ratio of the targeted items in the Top-$k$ recommendation list of the users in the target domain. We have conducted experiments on two real-world datasets and have empirically verified the effectiveness of our proposed framework and furthermore performed a thorough model analysis.
Reinforcement learning aims at searching the best policy model for decision making, and has been shown powerful for sequential recommendations. The training of the policy by reinforcement learning, however, is placed in an environment. In many real-world applications, however, the policy training in the real environment can cause an unbearable cost, due to the exploration in the environment. Environment reconstruction from the past data is thus an appealing way to release the power of reinforcement learning in these applications. The reconstruction of the environment is, basically, to extract the casual effect model from the data. However, real-world applications are often too complex to offer fully observable environment information. Therefore, quite possibly there are unobserved confounding variables lying behind the data. The hidden confounder can obstruct an effective reconstruction of the environment. In this paper, by treating the hidden confounder as a hidden policy, we propose a deconfounded multi-agent environment reconstruction (DEMER) approach in order to learn the environment together with the hidden confounder. DEMER adopts a multi-agent generative adversarial imitation learning framework. It proposes to introduce the confounder embedded policy, and use the compatible discriminator for training the policies. We then apply DEMER in an application of driver program recommendation. We firstly use an artificial driver program recommendation environment, abstracted from the real application, to verify and analyze the effectiveness of DEMER. We then test DEMER in the real application of Didi Chuxing. Experiment results show that DEMER can effectively reconstruct the hidden confounder, and thus can build the environment better. DEMER also derives a recommendation policy with a significantly improved performance in the test phase of the real application.
With the information explosion of news articles, personalized news recommendation has become important for users to quickly find news that they are interested in. Existing methods on news recommendation mainly include collaborative filtering methods which rely on direct user-item interactions and content based methods which characterize the content of user reading history. Although these methods have achieved good performances, they still suffer from data sparse problem, since most of them fail to extensively exploit high-order structure information (similar users tend to read similar news articles) in news recommendation systems. In this paper, we propose to build a heterogeneous graph to explicitly model the interactions among users, news and latent topics. The incorporated topic information would help indicate a user's interest and alleviate the sparsity of user-item interactions. Then we take advantage of graph neural networks to learn user and news representations that encode high-order structure information by propagating embeddings over the graph. The learned user embeddings with complete historic user clicks capture the users' long-term interests. We also consider a user's short-term interest using the recent reading history with an attention based LSTM model. Experimental results on real-world datasets show that our proposed model significantly outperforms state-of-the-art methods on news recommendation.
Recent years have witnessed growing interests in multimedia recommendation, which aims to predict whether a user will interact with an item with multimodal contents. Previous studies focus on modeling user-item interactions with multimodal features included as side information. However, this scheme is not well-designed for multimedia recommendation. Firstly, only collaborative item-item relationships are implicitly modeled through high-order item-user-item co-occurrences. We argue that the latent semantic item-item structures underlying these multimodal contents could be beneficial for learning better item representations and assist the recommender models to comprehensively discover candidate items. Secondly, previous studies disregard the fine-grained multimodal fusion. Although having access to multiple modalities might allow us to capture rich information, we argue that the simple coarse-grained fusion by linear combination or concatenation in previous work is insufficient to fully understand content information and item relationships.To this end, we propose a latent structure MIning with ContRastive mOdality fusion method (MICRO for brevity). To be specific, we devise a novel modality-aware structure learning module, which learns item-item relationships for each modality. Based on the learned modality-aware latent item relationships, we perform graph convolutions that explicitly inject item affinities to modality-aware item representations. Then, we design a novel contrastive method to fuse multimodal features. These enriched item representations can be plugged into existing collaborative filtering methods to make more accurate recommendations. Extensive experiments on real-world datasets demonstrate the superiority of our method over state-of-the-art baselines.
Recommender systems (RSs) employ user-item feedback, e.g., ratings, to match customers to personalized lists of products. Approaches to top-k recommendation mainly rely on Learning-To-Rank algorithms and, among them, the most widely adopted is Bayesian Personalized Ranking (BPR), which bases on a pair-wise optimization approach. Recently, BPR has been found vulnerable against adversarial perturbations of its model parameters. Adversarial Personalized Ranking (APR) mitigates this issue by robustifying BPR via an adversarial training procedure. The empirical improvements of APR's accuracy performance on BPR have led to its wide use in several recommender models. However, a key overlooked aspect has been the beyond-accuracy performance of APR, i.e., novelty, coverage, and amplification of popularity bias, considering that recent results suggest that BPR, the building block of APR, is sensitive to the intensification of biases and reduction of recommendation novelty. In this work, we model the learning characteristics of the BPR and APR optimization frameworks to give mathematical evidence that, when the feedback data have a tailed distribution, APR amplifies the popularity bias more than BPR due to an unbalanced number of received positive updates from short-head items. Using matrix factorization (MF), we empirically validate the theoretical results by performing preliminary experiments on two public datasets to compare BPR-MF and APR-MF performance on accuracy and beyond-accuracy metrics. The experimental results consistently show the degradation of novelty and coverage measures and a worrying amplification of bias.