Recommendation algorithms forecast user preferences by correlating user and item representations derived from historical interaction patterns. In pursuit of enhanced performance, many methods focus on learning robust and independent representations by disentangling the intricate factors within interaction data across various modalities in an unsupervised manner. However, such an approach obfuscates the discernment of how specific factors (e.g., category or brand) influence the outcomes, making it challenging to regulate their effects. In response to this challenge, we introduce a novel method called Attribute-Driven Disentangled Representation Learning (short for AD-DRL), which explicitly incorporates attributes from different modalities into the disentangled representation learning process. By assigning a specific attribute to each factor in multimodal features, AD-DRL can disentangle the factors at both attribute and attribute-value levels. To obtain robust and independent representations for each factor associated with a specific attribute, we first disentangle the representations of features both within and across different modalities. Moreover, we further enhance the robustness of the representations by fusing the multimodal features of the same factor. Empirical evaluations conducted on three public real-world datasets substantiate the effectiveness of AD-DRL, as well as its interpretability and controllability.
Knowledge graph (KG) based reasoning has been regarded as an effective means for the analysis of semantic networks and is of great usefulness in areas of information retrieval, recommendation, decision-making, and man-machine interaction. It is widely used in recommendation, decision-making, question-answering, search, and other fields. However, previous studies mainly used low-level knowledge in the KG for reasoning, which may result in insufficient generalization and poor robustness of reasoning. To this end, this paper proposes a new inference approach using a novel knowledge augmentation strategy to improve the generalization capability of KG. This framework extracts high-level pyramidal knowledge from low-level knowledge and applies it to reasoning in a multi-level hierarchical KG, called knowledge pyramid in this paper. We tested some medical data sets using the proposed approach, and the experimental results show that the proposed knowledge pyramid has improved the knowledge inference performance with better generalization. Especially, when there are fewer training samples, the inference accuracy can be significantly improved.
Topic models have been prevalent for decades to discover latent topics and infer topic proportions of documents in an unsupervised fashion. They have been widely used in various applications like text analysis and context recommendation. Recently, the rise of neural networks has facilitated the emergence of a new research field -- Neural Topic Models (NTMs). Different from conventional topic models, NTMs directly optimize parameters without requiring model-specific derivations. This endows NTMs with better scalability and flexibility, resulting in significant research attention and plentiful new methods and applications. In this paper, we present a comprehensive survey on neural topic models concerning methods, applications, and challenges. Specifically, we systematically organize current NTM methods according to their network structures and introduce the NTMs for various scenarios like short texts and cross-lingual documents. We also discuss a wide range of popular applications built on NTMs. Finally, we highlight the challenges confronted by NTMs to inspire future research.
Learning utility, or reward, models from pairwise comparisons is a fundamental component in a number of application domains. These approaches inherently entail collecting preference information from people, with feedback often provided anonymously. Since preferences are subjective, there is no gold standard to compare against; yet, reliance of high-impact systems on preference learning creates a strong motivation for malicious actors to skew data collected in this fashion to their ends. We investigate the nature and extent of this vulnerability systematically by considering a threat model in which an attacker can flip a small subset of preference comparisons with the goal of either promoting or demoting a target outcome. First, we propose two classes of algorithmic approaches for these attacks: a principled gradient-based framework, and several variants of rank-by-distance methods. Next, we demonstrate the efficacy of best attacks in both these classes in successfully achieving malicious goals on datasets from three diverse domains: autonomous control, recommendation system, and textual prompt-response preference learning. We find that the best attacks are often highly successful, achieving in the most extreme case 100% success rate with only 0.3% of the data poisoned. However, which attack is best can vary significantly across domains, demonstrating the value of our comprehensive vulnerability analysis that involves several classes of attack algorithms. In addition, we observe that the simpler and more scalable rank-by-distance approaches are often competitive with the best, and on occasion significantly outperform gradient-based methods. Finally, we show that several state-of-the-art defenses against other classes of poisoning attacks exhibit, at best, limited efficacy in our setting.
Temporal Point Processes (TPPs) hold a pivotal role in modeling event sequences across diverse domains, including social networking and e-commerce, and have significantly contributed to the advancement of recommendation systems and information retrieval strategies. Through the analysis of events such as user interactions and transactions, TPPs offer valuable insights into behavioral patterns, facilitating the prediction of future trends. However, accurately forecasting future events remains a formidable challenge due to the intricate nature of these patterns. The integration of Neural Networks with TPPs has ushered in the development of advanced deep TPP models. While these models excel at processing complex and nonlinear temporal data, they encounter limitations in modeling intensity functions, grapple with computational complexities in integral computations, and struggle to capture long-range temporal dependencies effectively. In this study, we introduce the CuFun model, representing a novel approach to TPPs that revolves around the Cumulative Distribution Function (CDF). CuFun stands out by uniquely employing a monotonic neural network for CDF representation, utilizing past events as a scaling factor. This innovation significantly bolsters the model's adaptability and precision across a wide range of data scenarios. Our approach addresses several critical issues inherent in traditional TPP modeling: it simplifies log-likelihood calculations, extends applicability beyond predefined density function forms, and adeptly captures long-range temporal patterns. Our contributions encompass the introduction of a pioneering CDF-based TPP model, the development of a methodology for incorporating past event information into future event prediction, and empirical validation of CuFun's effectiveness through extensive experimentation on synthetic and real-world datasets.
The goal of session-based recommendation in E-commerce is to predict the next item that an anonymous user will purchase based on the browsing and purchase history. However, constructing global or local transition graphs to supplement session data can lead to noisy correlations and user intent vanishing. In this work, we propose the Frequent Attribute Pattern Augmented Transformer (FAPAT) that characterizes user intents by building attribute transition graphs and matching attribute patterns. Specifically, the frequent and compact attribute patterns are served as memory to augment session representations, followed by a gate and a transformer block to fuse the whole session information. Through extensive experiments on two public benchmarks and 100 million industrial data in three domains, we demonstrate that FAPAT consistently outperforms state-of-the-art methods by an average of 4.5% across various evaluation metrics (Hits, NDCG, MRR). Besides evaluating the next-item prediction, we estimate the models' capabilities to capture user intents via predicting items' attributes and period-item recommendations.
Existing work has revealed that large-scale offline evaluation of recommender systems for user-item interactions is prone to bias caused by the deployed system itself, as a form of closed loop feedback. Many adopt the \textit{propensity} concept to analyze or mitigate this empirical issue. In this work, we extend the analysis to session-based setup and adapted propensity calculation to the unique characteristics of session-based recommendation tasks. Our experiments incorporate neural models and KNN-based models, and cover both the music and the e-commerce domain. We study the distributions of propensity and different stratification techniques on different datasets and find that propensity-related traits are actually dataset-specific. We then leverage the effect of stratification and achieve promising results compared to the original models.
Session-based recommendation (SR) aims to dynamically recommend items to a user based on a sequence of the most recent user-item interactions. Most existing studies on SR adopt advanced deep learning methods. However, the majority only consider a special behavior type (e.g., click), while those few considering multi-typed behaviors ignore to take full advantage of the relationships between products (items). In this case, the paper proposes a novel approach, called Substitutable and Complementary Relationships from Multi-behavior Data (denoted as SCRM) to better explore the relationships between products for effective recommendation. Specifically, we firstly construct substitutable and complementary graphs based on a user's sequential behaviors in every session by jointly considering `click' and `purchase' behaviors. We then design a denoising network to remove false relationships, and further consider constraints on the two relationships via a particularly designed loss function. Extensive experiments on two e-commerce datasets demonstrate the superiority of our model over state-of-the-art methods, and the effectiveness of every component in SCRM.
Bundle recommendations strive to offer users a set of items as a package named bundle, enhancing convenience and contributing to the seller's revenue. While previous approaches have demonstrated notable performance, we argue that they may compromise the ternary relationship among users, items, and bundles. This compromise can result in information loss, ultimately impacting the overall model performance. To address this gap, we develop a unified model for bundle recommendation, termed hypergraph-enhanced dual convolutional neural network (HED). Our approach is characterized by two key aspects. Firstly, we construct a complete hypergraph to capture interaction dynamics among users, items, and bundles. Secondly, we incorporate U-B interaction information to enhance the information representation derived from users and bundle embedding vectors. Extensive experimental results on the Youshu and Netease datasets have demonstrated that HED surpasses state-of-the-art baselines, proving its effectiveness. In addition, various ablation studies and sensitivity analyses revealed the working mechanism and proved our effectiveness. Codes and datasets are available at https://github.com/AAI-Lab/HED
Modern recommender systems (RS) have seen substantial success, yet they remain vulnerable to malicious activities, notably poisoning attacks. These attacks involve injecting malicious data into the training datasets of RS, thereby compromising their integrity and manipulating recommendation outcomes for gaining illicit profits. This survey paper provides a systematic and up-to-date review of the research landscape on Poisoning Attacks against Recommendation (PAR). A novel and comprehensive taxonomy is proposed, categorizing existing PAR methodologies into three distinct categories: Component-Specific, Goal-Driven, and Capability Probing. For each category, we discuss its mechanism in detail, along with associated methods. Furthermore, this paper highlights potential future research avenues in this domain. Additionally, to facilitate and benchmark the empirical comparison of PAR, we introduce an open-source library, ARLib, which encompasses a comprehensive collection of PAR models and common datasets. The library is released at https://github.com/CoderWZW/ARLib.