Abstract:The matrix contextual bandit (CB), as an extension of the well-known multi-armed bandit, is a powerful framework that has been widely applied in sequential decision-making scenarios involving low-rank structure. In many real-world scenarios, such as online advertising and recommender systems, additional graph information often exists beyond the low-rank structure, that is, the similar relationships among users/items can be naturally captured through the connectivity among nodes in the corresponding graphs. However, existing matrix CB methods fail to explore such graph information, and thereby making them difficult to generate effective decision-making policies. To fill in this void, we propose in this paper a novel matrix CB algorithmic framework that builds upon the classical upper confidence bound (UCB) framework. This new framework can effectively integrate both the low-rank structure and graph information in a unified manner. Specifically, it involves first solving a joint nuclear norm and matrix Laplacian regularization problem, followed by the implementation of a graph-based generalized linear version of the UCB algorithm. Rigorous theoretical analysis demonstrates that our procedure outperforms several popular alternatives in terms of cumulative regret bound, owing to the effective utilization of graph information. A series of synthetic and real-world data experiments are conducted to further illustrate the merits of our procedure.
Abstract:We consider the problem of matrix completion with graphs as side information depicting the interrelations between variables. The key challenge lies in leveraging the similarity structure of the graph to enhance matrix recovery. Existing approaches, primarily based on graph Laplacian regularization, suffer from several limitations: (1) they focus only on the similarity between neighboring variables, while overlooking long-range correlations; (2) they are highly sensitive to false edges in the graphs and (3) they lack theoretical guarantees regarding statistical and computational complexities. To address these issues, we propose in this paper a novel graph regularized matrix completion algorithm called GSGD, based on preconditioned projected gradient descent approach. We demonstrate that GSGD effectively captures the higher-order correlation information behind the graphs, and achieves superior robustness and stability against the false edges. Theoretically, we prove that GSGD achieves linear convergence to the global optimum with near-optimal sample complexity, providing the first theoretical guarantees for both recovery accuracy and efficacy in the perspective of nonconvex optimization. Our numerical experiments on both synthetic and real-world data further validate that GSGD achieves superior recovery accuracy and scalability compared with several popular alternatives.
Abstract:Top-$K$ recommendation involves inferring latent user preferences and generating personalized recommendations accordingly, which is now ubiquitous in various decision systems. Nonetheless, recommender systems usually suffer from severe \textit{popularity bias}, leading to the over-recommendation of popular items. Such a bias deviates from the central aim of reflecting user preference faithfully, compromising both customer satisfaction and retailer profits. Despite the prevalence, existing methods tackling popularity bias still have limitations due to the considerable accuracy-debias tradeoff and the sensitivity to extensive parameter selection, further exacerbated by the extreme sparsity in positive user-item interactions. In this paper, we present a \textbf{Pop}ularity-aware top-$K$ recommendation algorithm integrating multi-behavior \textbf{S}ide \textbf{I}nformation (PopSI), aiming to enhance recommendation accuracy and debias performance simultaneously. Specifically, by leveraging multiple user feedback that mirrors similar user preferences and formulating it as a three-dimensional tensor, PopSI can utilize all slices to capture the desiring user preferences effectively. Subsequently, we introduced a novel orthogonality constraint to refine the estimated item feature space, enforcing it to be invariant to item popularity features thereby addressing our model's sensitivity to popularity bias. Comprehensive experiments on real-world e-commerce datasets demonstrate the general improvements of PopSI over state-of-the-art debias methods with a marginal accuracy-debias tradeoff and scalability to practical applications. The source code for our algorithm and experiments is available at \url{https://github.com/Eason-sys/PopSI}.