Conversational recommender systems (CRS) have shown great success in accurately capturing a user's current and detailed preference through the multi-round interaction cycle while effectively guiding users to a more personalized recommendation. Perhaps surprisingly, conversational recommender systems can be plagued by popularity bias, much like traditional recommender systems. In this paper, we systematically study the problem of popularity bias in CRSs. We demonstrate the existence of popularity bias in existing state-of-the-art CRSs from an exposure rate, a success rate, and a conversational utility perspective, and propose a suite of popularity bias metrics designed specifically for the CRS setting. We then introduce a debiasing framework with three unique features: (i) Popularity-Aware Focused Learning to reduce the popularity-distorting impact on preference prediction; (ii) Cold-Start Item Embedding Reconstruction via Attribute Mapping, to improve the modeling of cold-start items; and (iii) Dual-Policy Learning, to better guide the CRS when dealing with either popular or unpopular items. Through extensive experiments on two frequently used CRS datasets, we find the proposed model-agnostic debiasing framework not only mitigates the popularity bias in state-of-the-art CRSs but also improves the overall recommendation performance.
Users who come to recommendation platforms are heterogeneous in activity levels. There usually exists a group of core users who visit the platform regularly and consume a large body of content upon each visit, while others are casual users who tend to visit the platform occasionally and consume less each time. As a result, consumption activities from core users often dominate the training data used for learning. As core users can exhibit different activity patterns from casual users, recommender systems trained on historical user activity data usually achieve much worse performance on casual users than core users. To bridge the gap, we propose a model-agnostic framework L2Aug to improve recommendations for casual users through data augmentation, without sacrificing core user experience. L2Aug is powered by a data augmentor that learns to generate augmented interaction sequences, in order to fine-tune and optimize the performance of the recommendation system for casual users. On four real-world public datasets, L2Aug outperforms other treatment methods and achieves the best sequential recommendation performance for both casual and core users. We also test L2Aug in an online simulation environment with real-time feedback to further validate its efficacy, and showcase its flexibility in supporting different augmentation actions.
Session-based recommender systems aim to improve recommendations in short-term sessions that can be found across many platforms. A critical challenge is to accurately model user intent with only limited evidence in these short sessions. For example, is a flower bouquet being viewed meant as part of a wedding purchase or for home decoration? Such different perspectives greatly impact what should be recommended next. Hence, this paper proposes a novel session-based recommendation system empowered by hypergraph attention networks. Three unique properties of the proposed approach are: (i) it constructs a hypergraph for each session to model the item correlations defined by various contextual windows in the session simultaneously, to uncover item meanings; (ii) it is equipped with hypergraph attention layers to generate item embeddings by flexibly aggregating the contextual information from correlated items in the session; and (iii) it aggregates the dynamic item representations for each session to infer the general purpose and current need, which is decoded to infer the next interesting item in the session. Through experiments on three benchmark datasets, we find the proposed model is effective in generating informative dynamic item embeddings and providing more accurate recommendations compared to the state-of-the-art.
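Property (i) above can be illustrated with a minimal sketch. It assumes that each sliding window over the session contributes one hyperedge containing the items inside that window; the function name `window_hyperedges` and the chosen window sizes are hypothetical and are not taken from the paper.

```python
def window_hyperedges(session, window_sizes):
    """Build hyperedges from sliding contextual windows over one session.

    Assumption (not from the paper): every window of size w yields one
    hyperedge holding the distinct items that fall inside the window.
    """
    edges = set()
    for w in window_sizes:
        # A window larger than the session collapses to a single window.
        last_start = max(len(session) - w + 1, 1)
        for start in range(last_start):
            edge = frozenset(session[start:start + w])
            if len(edge) > 1:  # a single-item edge carries no correlation
                edges.add(edge)
    return edges


def incidence_matrix(session, edges):
    """Return (items, edge_list, H) where H[i][e] = 1 if item i is in edge e."""
    items = sorted(set(session))
    edge_list = sorted(edges, key=lambda e: (len(e), sorted(e)))
    matrix = [[1 if item in e else 0 for e in edge_list] for item in items]
    return items, edge_list, matrix
```

For the session `["a", "b", "c", "b"]` with window sizes 2 and 3, the sketch yields three hyperedges, `{a, b}`, `{b, c}`, and `{a, b, c}`, so the same item participates in several contexts of different granularity.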
Inspired by the extensive success of deep learning, graph neural networks (GNNs) have been proposed to learn expressive node representations and have demonstrated promising performance in various graph learning tasks. However, existing endeavors predominantly focus on the conventional semi-supervised setting where relatively abundant gold-labeled nodes are provided. This setting is often impractical because data labeling is laborious and requires intensive domain knowledge, especially when considering the heterogeneity of graph-structured data. Under the few-shot semi-supervised setting, the performance of most existing GNNs is inevitably undermined by overfitting and oversmoothing, largely owing to the shortage of labeled data. In this paper, we propose a decoupled network architecture equipped with a novel meta-learning algorithm to solve this problem. In essence, our framework Meta-PN infers high-quality pseudo labels on unlabeled nodes via a meta-learned label propagation strategy, which effectively augments the scarce labeled data while enabling large receptive fields during training. Extensive experiments demonstrate that our approach offers easy and substantial performance gains compared to existing techniques on various benchmark datasets.
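To make the idea of inferring pseudo labels by label propagation concrete, the following is a minimal sketch of plain (not meta-learned) propagation. It assumes a fixed teleport probability `alpha`, row-normalized adjacency with self-loops, and a fixed number of steps; these choices and the function names are illustrative assumptions, not the Meta-PN algorithm itself.

```python
def propagate_labels(adj, seed_labels, num_classes, alpha=0.1, steps=10):
    """Spread a few seed labels over a graph.

    adj:         list of neighbor lists, adj[i] = neighbors of node i
    seed_labels: dict {node: class_index} for the few labeled nodes
    Assumption: fixed alpha and step count stand in for anything that a
    meta-learned strategy would adapt.
    """
    n = len(adj)
    # Add self-loops so every node also keeps part of its own score.
    neighbors = [sorted(set(adj[i]) | {i}) for i in range(n)]
    y0 = [[0.0] * num_classes for _ in range(n)]
    for node, cls in seed_labels.items():
        y0[node][cls] = 1.0
    y = [row[:] for row in y0]
    for _ in range(steps):
        new_y = []
        for i in range(n):
            deg = len(neighbors[i])
            row = []
            for c in range(num_classes):
                avg = sum(y[j][c] for j in neighbors[i]) / deg
                # Mix the neighborhood average with the original seed signal.
                row.append((1 - alpha) * avg + alpha * y0[i][c])
            new_y.append(row)
        y = new_y
    return y


def pseudo_labels(scores):
    """Pick the highest-scoring class for every node."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]
```

On a four-node path graph with one seed at each end, the two interior nodes inherit the label of the nearer seed. Because propagation is decoupled from any feature transformation, the number of steps (and hence the receptive field) can grow without adding trainable layers.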
Online video services acquire new content on a daily basis to increase engagement and improve the user experience. Traditional recommender systems rely solely on watch history, delaying the recommendation of newly added titles to the right customer. However, one can use the metadata of a cold-start title to bootstrap the personalization. In this work, we propose to adopt a two-tower model, in which one tower learns the user representation from watch history and the other tower learns effective representations for titles from metadata. The contributions of this work can be summarized as follows: (1) we show the feasibility of using a two-tower model for recommendations and conduct a series of offline experiments to show its performance for cold-start titles; (2) we explore different types of metadata (categorical features, text description, cover-art image) and an attention layer to fuse them; (3) with our Amazon proprietary data, we show that the attention layer can assign weights adaptively to different metadata, improving recommendation for both warm- and cold-start items.
A fundamental challenge for sequential recommenders is to capture the sequential patterns of users, that is, to model how users transition among items. In many practical scenarios, however, there are a great number of cold-start users with only minimal logged interactions. As a result, existing sequential recommendation models lose their predictive power because sequential patterns are difficult to learn for users with only limited interactions. In this work, we aim to improve sequential recommendation for cold-start users with a novel framework named MetaTL, which learns to model the transition patterns of users through meta-learning. Specifically, the proposed MetaTL: (i) formulates sequential recommendation for cold-start users as a few-shot learning problem; (ii) extracts the dynamic transition patterns among users with a translation-based architecture; and (iii) adopts meta transitional learning to enable fast learning for cold-start users with only limited interactions, leading to accurate inference of sequential interactions.
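The translation-based idea in (ii) can be sketched as follows. The sketch assumes fixed item embeddings and estimates a user's transition vector as the mean difference between consecutive items, then scores a candidate by how close it lies to the last item plus that vector. The averaging rule and all names are assumptions made for illustration; the meta-learning step in (iii) is omitted.

```python
def transition_vector(sequence, emb):
    """Average the (next - current) embedding differences of one user.

    Assumption: a plain mean of differences stands in for whatever
    learned transition extractor the actual model uses.
    """
    if len(sequence) < 2:
        raise ValueError("need at least two interactions")
    pairs = list(zip(sequence, sequence[1:]))
    dim = len(emb[sequence[0]])
    return [
        sum(emb[tail][d] - emb[head][d] for head, tail in pairs) / len(pairs)
        for d in range(dim)
    ]


def translation_score(head, candidate, transition, emb):
    """Negative squared distance between head + transition and candidate."""
    return -sum(
        (emb[head][d] + transition[d] - emb[candidate][d]) ** 2
        for d in range(len(transition))
    )


def rank_next(sequence, emb, candidates):
    """Rank candidates for the next interaction, best first."""
    tr = transition_vector(sequence, emb)
    last = sequence[-1]
    return sorted(
        candidates,
        key=lambda c: translation_score(last, c, tr, emb),
        reverse=True,
    )
```

With items laid out on a line, a user who moved from `a` to `b` to `c` in unit steps is predicted to continue one more unit step, so the item at that position ranks first even though only two transitions were observed.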
Graphs are widely used to model the relational structure of data, and research on graph machine learning (ML) has a wide spectrum of applications ranging from drug design in molecular graphs to friendship recommendation in social networks. Prevailing approaches for graph ML typically require abundant labeled instances to achieve satisfactory results, which is commonly infeasible in real-world scenarios since labeled data for newly emerged concepts (e.g., new categorizations of nodes) on graphs is limited. Though meta-learning has been applied to different few-shot graph learning problems, most existing efforts predominantly assume that all the data from the seen classes is gold-labeled, and those methods may lose their efficacy when the seen data is weakly-labeled with severe label noise. As such, we aim to investigate a novel problem of weakly-supervised graph meta-learning for improving the model robustness in terms of knowledge transfer. To achieve this goal, we propose a new graph meta-learning framework -- Graph Hallucination Networks (Meta-GHN) in this paper. Based on a new robustness-enhanced episodic training, Meta-GHN is meta-learned to hallucinate clean node representations from weakly-labeled data and extract highly transferable meta-knowledge, which enables the model to quickly adapt to unseen tasks with few labeled instances. Extensive experiments demonstrate the superiority of Meta-GHN over existing graph meta-learning studies on the task of weakly-supervised few-shot node classification.
Recommendation algorithms typically build models based on historical user-item interactions (e.g., clicks, likes, or ratings) to provide a personalized ranked list of items. These interactions are often distributed unevenly over different groups of items due to varying user preferences. However, we show that recommendation algorithms can inherit or even amplify this imbalanced distribution, leading to unfair recommendations to item groups. Concretely, we formalize the concepts of ranking-based statistical parity and equal opportunity as two measures of fairness in personalized ranking recommendation for item groups. Then, we empirically show that one of the most widely adopted algorithms -- Bayesian Personalized Ranking -- produces unfair recommendations, which motivates our effort to propose a novel fairness-aware personalized ranking model. The debiased model is able to improve the two proposed fairness metrics while preserving recommendation performance. Experiments on three public datasets show strong fairness improvement of the proposed model versus state-of-the-art alternatives. This paper is an extended and reorganized version of our SIGIR 2020~\cite{zhu2020measuring} paper. In this paper, we re-frame the studied problem as `item recommendation fairness' in personalized ranking recommendation systems, and provide more details about the training process of the proposed model and the experimental setup.
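A ranking-based statistical parity measure can be sketched as follows. The abstract does not spell out the formula, so this sketch assumes one plausible formalization: for each item group, estimate the probability that a candidate item from that group lands in a user's top-k list, then report the relative standard deviation of these probabilities across groups (0 means equal exposure). The function names and input layout are hypothetical.

```python
from statistics import mean, pstdev


def group_topk_probability(topk, candidates, item_group):
    """Estimate P(item is ranked in top-k | item belongs to group g).

    topk:       dict {user: list of recommended items}
    candidates: dict {user: iterable of items eligible for that user}
    item_group: dict {item: group label}
    """
    hits, totals = {}, {}
    for user, eligible in candidates.items():
        recommended = set(topk.get(user, []))
        for item in eligible:
            g = item_group[item]
            totals[g] = totals.get(g, 0) + 1
            if item in recommended:
                hits[g] = hits.get(g, 0) + 1
    return {g: hits.get(g, 0) / totals[g] for g in totals}


def relative_std(group_probs):
    """Assumed disparity score: std / mean of the per-group probabilities."""
    values = list(group_probs.values())
    m = mean(values)
    return pstdev(values) / m if m > 0 else 0.0
```

An equal-opportunity variant would follow the same pattern but restrict `candidates` to the items each user actually liked in held-out data, so that groups are compared only on relevant items.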
Text classification is a critical research topic with broad applications in natural language processing. Recently, graph neural networks (GNNs) have received increasing attention in the research community and demonstrated promising results on this canonical task. Despite this success, their performance could be largely jeopardized in practice since they are: (1) unable to capture high-order interactions between words; and (2) inefficient at handling large datasets and new documents. To address those issues, in this paper, we propose a principled model -- hypergraph attention networks (HyperGAT), which can obtain more expressive power with less computational cost for text representation learning. Extensive experiments on various benchmark datasets demonstrate the efficacy of the proposed approach on the text classification task.
Attributed networks nowadays are ubiquitous in a myriad of high-impact applications, such as social network analysis, financial fraud detection, and drug discovery. As a central analytical task on attributed networks, node classification has received much attention in the research community. In real-world attributed networks, a large portion of node classes only contain limited labeled instances, rendering a long-tail node class distribution. Existing node classification algorithms are unequipped to handle the \textit{few-shot} node classes. As a remedy, few-shot learning has attracted a surge of attention in the research community. Yet, few-shot node classification remains a challenging problem as we need to address the following questions: (i) How to extract meta-knowledge from an attributed network for few-shot node classification? (ii) How to identify the informativeness of each labeled instance for building a robust and effective model? To answer these questions, in this paper, we propose a graph meta-learning framework -- Graph Prototypical Networks (GPN). By constructing a pool of semi-supervised node classification tasks to mimic the real test environment, GPN is able to perform \textit{meta-learning} on an attributed network and derive a highly generalizable model for handling the target classification task. Extensive experiments demonstrate the superior capability of GPN in few-shot node classification.
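The prototype-based classification step that underlies prototypical networks can be sketched as follows. The sketch assumes node embeddings are already available, builds one prototype per class as a (possibly weighted) mean of its few labeled support embeddings, and assigns a query node to the nearest prototype. The optional `weights` argument is a hypothetical stand-in for a per-instance informativeness score; how GPN actually computes such scores is not shown here.

```python
def class_prototypes(support, weights=None):
    """Compute one prototype per class.

    support: dict {class: list of embedding vectors}
    weights: optional dict {class: list of non-negative floats}, one per
             support vector (assumption: uniform weights if omitted).
    """
    protos = {}
    for cls, vectors in support.items():
        w = weights[cls] if weights and cls in weights else [1.0] * len(vectors)
        total = sum(w)
        dim = len(vectors[0])
        protos[cls] = [
            sum(wi * v[d] for wi, v in zip(w, vectors)) / total
            for d in range(dim)
        ]
    return protos


def classify(query, protos):
    """Return the class whose prototype is closest in squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda cls: sq_dist(query, protos[cls]))
```

In an episodic setup, `support` would be resampled for every training task so that the embedding function is optimized to produce prototypes that generalize to classes with only a handful of labeled nodes.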