A recommender system is an information filtering technology which can be used to predict preference ratings of items (products, services, movies, etc) and/or to output a ranking of items that are likely to be of interest to the user. Context-aware recommender systems (CARS) learn and predict the tastes and preferences of users by incorporating available contextual information in the recommendation process. One of the major challenges in context-aware recommender systems research is the lack of automatic methods to obtain contextual information for these systems. Considering this scenario, in this paper, we propose to use contextual information from topic hierarchies of the items (web pages) to improve the performance of context-aware recommender systems. The topic hierarchies are constructed by an extension of the LUPI-based Incremental Hierarchical Clustering method that considers three types of information: traditional bag-of-words (technical information), and the combination of named entities (privileged information I) with domain terms (privileged information II). We evaluated the contextual information in four context-aware recommender systems. Different weights were assigned to each type of information. The empirical results demonstrated that topic hierarchies with the combination of the two kinds of privileged information can provide better recommendations.
Considering the rapidly increasing number of academic papers, searching for and citing appropriate references has become a non-trial task during the wiring of papers. Recommending a handful of candidate papers to a manuscript before publication could ease the burden of the authors, and help the reviewers to check the completeness of the cited resources. Conventional approaches on citation recommendation generally consider recommending one ground-truth citation for a query context from an input manuscript, but lack of consideration on co-citation recommendations. However, a piece of context often needs to be supported by two or more co-citation pairs. Here, we propose a novel scientific paper modeling for citation recommendations, namely Multi-Positive BERT Model for Citation Recommendation (MP-BERT4CR), complied with a series of Multi-Positive Triplet objectives to recommend multiple positive citations for a query context. The proposed approach has the following advantages: First, the proposed multi-positive objectives are effective to recommend multiple positive candidates. Second, we adopt noise distributions which are built based on the historical co-citation frequencies, so that MP-BERT4CR is not only effective on recommending high-frequent co-citation pairs; but also the performances on retrieving the low-frequent ones are significantly improved. Third, we propose a dynamic context sampling strategy which captures the ``macro-scoped'' citing intents from a manuscript and empowers the citation embeddings to be content-dependent, which allow the algorithm to further improve the performances. Single and multiple positive recommendation experiments testified that MP-BERT4CR delivered significant improvements. In addition, MP-BERT4CR are also effective in retrieving the full list of co-citations, and historically low-frequent co-citation pairs compared with the prior works.
The music domain is among the most important ones for adopting recommender systems technology. In contrast to most other recommendation domains, which predominantly rely on collaborative filtering (CF) techniques, music recommenders have traditionally embraced content-based (CB) approaches. In the past years, music recommendation models that leverage collaborative and content data -- which we refer to as content-driven models -- have been replacing pure CF or CB models. In this survey, we review 47 articles on content-driven music recommendation. Based on a thorough literature analysis, we first propose an onion model comprising five layers, each of which corresponds to a category of music content we identified: signal, embedded metadata, expert-generated content, user-generated content, and derivative content. We provide a detailed characterization of each category along several dimensions. Second, we identify six overarching challenges, according to which we organize our main discussion: increasing recommendation diversity and novelty, providing transparency and explanations, accomplishing context-awareness, recommending sequences of music, improving scalability and efficiency, and alleviating cold start. Each article addressing one or more of these challenges is categorized according to the content layers of our onion model, the article's goal(s), and main methodological choices. Furthermore, articles are discussed in temporal order to shed light on the evolution of content-driven music recommendation strategies. Finally, we provide our personal selection of the persisting grand challenges, which are still waiting to be solved in future research endeavors.
Research has shown that recommender systems are typically biased towards popular items, which leads to less popular items being underrepresented in recommendations. The recent work of Abdollahpouri et al. in the context of movie recommendations has shown that this popularity bias leads to unfair treatment of both long-tail items as well as users with little interest in popular items. In this paper, we reproduce the analyses of Abdollahpouri et al. in the context of music recommendation. Specifically, we investigate three user groups from the LastFM music platform that are categorized based on how much their listening preferences deviate from the most popular music among all LastFM users in the dataset: (i) low-mainstream users, (ii) medium-mainstream users, and (iii) high-mainstream users. In line with Abdollahpouri et al., we find that state-of-the-art recommendation algorithms favor popular items also in the music domain. However, their proposed Group Average Popularity metric yields different results for LastFM than for the movie domain, presumably due to the larger number of available items (i.e., music artists) in the LastFM dataset we use. Finally, we compare the accuracy results of the recommendation algorithms for the three user groups and find that the low-mainstreaminess group significantly receives the worst recommendations.
Item recommendation is a personalized ranking task. To this end, many recommender systems optimize models with pairwise ranking objectives, such as the Bayesian Personalized Ranking (BPR). Using matrix Factorization (MF) --- the most widely used model in recommendation --- as a demonstration, we show that optimizing it with BPR leads to a recommender model that is not robust. In particular, we find that the resultant model is highly vulnerable to adversarial perturbations on its model parameters, which implies the possibly large error in generalization. To enhance the robustness of a recommender model and thus improve its generalization performance, we propose a new optimization framework, namely Adversarial Personalized Ranking (APR). In short, our APR enhances the pairwise ranking method BPR by performing adversarial training. It can be interpreted as playing a minimax game, where the minimization of the BPR objective function meanwhile defends an adversary, which adds adversarial perturbations on model parameters to maximize the BPR objective function. To illustrate how it works, we implement APR on MF by adding adversarial perturbations on the embedding vectors of users and items. Extensive experiments on three public real-world datasets demonstrate the effectiveness of APR --- by optimizing MF with APR, it outperforms BPR with a relative improvement of 11.2% on average and achieves state-of-the-art performance for item recommendation. Our implementation is available at: https://github.com/hexiangnan/adversarial_personalized_ranking.
Exposure bias is a well-known issue in recommender systems where items and suppliers are not equally represented in the recommendation results. This is especially problematic when bias is amplified over time as a few popular items are repeatedly over-represented in recommendation lists. This phenomenon can be viewed as a recommendation feedback loop: the system repeatedly recommends certain items at different time points and interactions of users with those items will amplify bias towards those items over time. This issue has been extensively studied in the literature on model-based or neighborhood-based recommendation algorithms, but less work has been done on online recommendation models such as those based on multi-armed Bandit algorithms. In this paper, we study exposure bias in a class of well-known bandit algorithms known as Linear Cascade Bandits. We analyze these algorithms on their ability to handle exposure bias and provide a fair representation for items and suppliers in the recommendation results. Our analysis reveals that these algorithms fail to treat items and suppliers fairly and do not sufficiently explore the item space for each user. To mitigate this bias, we propose a discounting factor and incorporate it into these algorithms that controls the exposure of items at each time step. To show the effectiveness of the proposed discounting factor on mitigating exposure bias, we perform experiments on two datasets using three cascading bandit algorithms and our experimental results show that the proposed method improves the exposure fairness for items and suppliers.
Recommender systems are tools that support online users by pointing them to potential items of interest in situations of information overload. In recent years, the class of session-based recommendation algorithms received more attention in the research literature. These algorithms base their recommendations solely on the observed interactions with the user in an ongoing session and do not require the existence of long-term preference profiles. Most recently, a number of deep learning based ("neural") approaches to session-based recommendations were proposed. However, previous research indicates that today's complex neural recommendation methods are not always better than comparably simple algorithms in terms of prediction accuracy. With this work, our goal is to shed light on the state-of-the-art in the area of session-based recommendation and on the progress that is made with neural approaches. For this purpose, we compare twelve algorithmic approaches, among them six recent neural methods, under identical conditions on various datasets. We find that the progress in terms of prediction accuracy that is achieved with neural methods is still limited. In most cases, our experiments show that simple heuristic methods based on nearest-neighbors schemes are preferable over conceptually and computationally more complex methods. Observations from a user study furthermore indicate that recommendations based on heuristic methods were also well accepted by the study participants. To support future progress and reproducibility in this area, we publicly share the session-rec evaluation framework that was used in our research.
In Recommender systems, data representation techniques play a great role as they have the power to entangle, hide and reveal explanatory factors embedded within datasets. Hence, they influence the quality of recommendations. Specifically, in Visual Art (VA) recommendations the complexity of the concepts embodied within paintings, makes the task of capturing semantics by machines far from trivial. In VA recommendation, prominent works commonly use manually curated metadata to drive recommendations. Recent works in this domain aim at leveraging visual features extracted using Deep Neural Networks (DNN). However, such data representation approaches are resource demanding and do not have a direct interpretation, hindering user acceptance. To address these limitations, we introduce an approach for Personalised Recommendation of Visual arts based on learning latent semantic representation of paintings. Specifically, we trained a Latent Dirichlet Allocation (LDA) model on textual descriptions of paintings. Our LDA model manages to successfully uncover non-obvious semantic relationships between paintings whilst being able to offer explainable recommendations. Experimental evaluations demonstrate that our method tends to perform better than exploiting visual features extracted using pre-trained Deep Neural Networks.
Medication recommendation is a crucial task for intelligent healthcare systems. Previous studies mainly recommend medications with electronic health records(EHRs). However, some details of interactions between doctors and patients may be ignored in EHRs, which are essential for automatic medication recommendation. Therefore, we make the first attempt to recommend medications with the conversations between doctors and patients. In this work, we construct DialMed, the first high-quality dataset for medical dialogue-based medication recommendation task. It contains 11,996 medical dialogues related to 16 common diseases from 3 departments and 70 corresponding common medications. Furthermore, we propose a Dialogue structure and Disease knowledge aware Network(DDN), where a graph attention network is utilized to model the dialogue structure and the knowledge graph is used to introduce external disease knowledge. The extensive experimental results demonstrate that the proposed method is a promising solution to recommend medications with medical dialogues. The dataset and code are available at https://github.com/Hhhhhhhzf/DialMed.
Well acquisition in the oil and gas industry can often be a hit or miss process, with a poor purchase resulting in substantial loss. Recommender systems suggest items (wells) that users (companies) are likely to buy based on past activity, and applying this system to well acquisition can increase company profits. While traditional recommender systems are impactful enough on their own, they are not optimized. This is because they ignore many of the complexities involved in human decision-making, and frequently make subpar recommendations. Using a preexisting Python implementation of a Factorization Machine results in more accurate recommendations based on a user-level ranking system. We train a Factorization Machine model on oil and gas well data that includes features such as elevation, total depth, and location. The model produces recommendations by using similarities between companies and wells, as well as their interactions. Our model has a hit rate of 0.680, reciprocal rank of 0.469, precision of 0.229, and recall of 0.463. These metrics imply that while our model is able to recommend the correct wells in a general sense, it does not match exact wells to companies via relevance. To improve the model's accuracy, future models should incorporate additional features such as the well's production data and ownership duration as these features will produce more accurate recommendations.