Beihong Jin

State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China, University of Chinese Academy of Sciences, Beijing, China

Deep Situation-Aware Interaction Network for Click-Through Rate Prediction

Apr 14, 2026

Yimin Lv, Shuli Wang, Beihong Jin, Yisong Yu, Yapeng Zhang, Jian Dong, Yongkang Wang, Xingxing Wang, Dong Wang

Abstract: User behavior sequence modeling plays a significant role in Click-Through Rate (CTR) prediction on e-commerce platforms. Besides the interacted items, user behaviors contain rich interaction information, such as the behavior type, time, and location. However, the information related to user behaviors has not yet been fully exploited. In this paper, we propose the concept of a situation and situational features for distinguishing interaction behaviors and then design a CTR model named Deep Situation-Aware Interaction Network (DSAIN). DSAIN first adopts the reparameterization trick to reduce noise in the original user behavior sequences. Then it learns the embeddings of situational features by feature embedding parameterization and tri-directional correlation fusion. Finally, it obtains the embedding of the behavior sequence via heterogeneous situation aggregation. We conduct extensive offline experiments on three real-world datasets. Experimental results demonstrate the superiority of the proposed DSAIN model. More importantly, DSAIN has increased the CTR by 2.70%, the CPM by 2.62%, and the GMV by 2.16% in the online A/B test. DSAIN has now been deployed on the Meituan food delivery platform and serves the main traffic of the Meituan takeout app.

* RecSys'23 Full Paper
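
The abstract above mentions the reparameterization trick without giving details. As a rough illustration only, the generic form of the trick draws a sample as z = mu + sigma * eps with eps ~ N(0, 1), so that randomness is isolated in eps. The function name `reparameterize` and the toy values are assumptions made for this sketch; how DSAIN applies the trick to denoise behavior sequences is not specified here.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps elementwise, with eps ~ N(0, 1).

    sigma is recovered from the log-variance as exp(0.5 * log_var).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


# Toy values (hypothetical): a 3-dimensional mean and a small variance.
rng = random.Random(0)
mu = [0.5, -1.0, 2.0]
log_var = [-2.0, -2.0, -2.0]

# Averaging many samples should recover the mean, since eps has zero mean.
samples = [reparameterize(mu, log_var, rng) for _ in range(5000)]
mean_est = [sum(s[i] for s in samples) / len(samples) for i in range(len(mu))]
```

Because the noise enters only through `eps`, the sample is a deterministic function of `mu` and `log_var`, which is what lets gradients flow to those parameters in a trained model.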

Unleashing the Potential of Sparse Attention on Long-term Behaviors for CTR Prediction

Jan 25, 2026

Weijiang Lai, Beihong Jin, Di Zhang, Siru Chen, Jiongyan Zhang, Yuhang Gou, Jian Dong, Xingxing Wang

Abstract: In recent years, the success of large language models (LLMs) has driven the exploration of scaling laws in recommender systems. However, models that demonstrate scaling laws are challenging to deploy in industrial settings for modeling long sequences of user behaviors, due to the high computational complexity of the standard self-attention mechanism. Despite various sparse self-attention mechanisms proposed in other fields, they are not fully suited for recommendation scenarios. This is because user behaviors exhibit personalization and temporal characteristics: different users have distinct behavior patterns, and these patterns change over time, with data from these users differing significantly from data in other fields in terms of distribution. To address these challenges, we propose SparseCTR, an efficient and effective model specifically designed for long-term behaviors of users. To be precise, we first segment behavior sequences into chunks in a personalized manner to avoid separating continuous behaviors and enable parallel processing of sequences. Based on these chunks, we propose a three-branch sparse self-attention mechanism to jointly identify users' global interests, interest transitions, and short-term interests. Furthermore, we design a composite relative temporal encoding via learnable, head-specific bias coefficients, better capturing sequential and periodic relationships among user behaviors. Extensive experimental results show that SparseCTR not only improves efficiency but also outperforms state-of-the-art methods. More importantly, it exhibits an obvious scaling law phenomenon, maintaining performance improvements across three orders of magnitude in FLOPs. In online A/B testing, SparseCTR increased CTR by 1.72% and CPM by 1.41%. Our source code is available at https://github.com/laiweijiang/SparseCTR.

* WWW 2026: THE ACM WEB CONFERENCE
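
To make the idea of a head-specific temporal bias in attention more concrete, here is a minimal sketch under stated assumptions. The bias form `-slope * log(1 + gap)` and the names `attention_weights` and `slope` are hypothetical choices for illustration; they are not taken from SparseCTR, whose actual encoding should be read from the linked repository. Timestamps are assumed to be non-decreasing.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, timestamps, slope):
    """Causal attention weights for one head with a time-gap bias.

    score(i, j) = q_i . k_j / sqrt(d) - slope * log(1 + (t_i - t_j)),
    where slope plays the role of a per-head bias coefficient (assumed form).
    """
    d = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        scores = []
        for j in range(i + 1):  # causal: attend only to positions j <= i
            dot = sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            gap = timestamps[i] - timestamps[j]
            scores.append(dot - slope * math.log1p(gap))
        weights.append(softmax(scores))
    return weights


# Identical queries and keys, so only the time gaps differentiate positions.
queries = [[1.0, 0.0]] * 3
keys = [[1.0, 0.0]] * 3
timestamps = [0.0, 10.0, 11.0]
decayed = attention_weights(queries, keys, timestamps, slope=1.0)
uniform = attention_weights(queries, keys, timestamps, slope=0.0)
```

With `slope=1.0` the last position weights recent behaviors more heavily than old ones, while `slope=0.0` falls back to content-only attention; a different slope per head would let each head focus on a different time horizon.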

Semantic Gaussian Mixture Variational Autoencoder for Sequential Recommendation

Feb 22, 2025

Beibei Li, Tao Xiang, Beihong Jin, Yiyuan Zheng, Rui Zhao

Abstract: Variational AutoEncoder (VAE) for Sequential Recommendation (SR), which learns a continuous distribution for each user-item interaction sequence rather than a deterministic embedding, is robust against data deficiency and achieves strong performance. However, existing VAE-based SR models assume a unimodal Gaussian distribution as the prior distribution of sequence representations, which restricts their capability to capture complex user interests and limits recommendation performance when users have more than one interest. Because it is common for users to have multiple disparate interests, we argue that it is more reasonable to establish a multimodal prior distribution in SR scenarios instead of a unimodal one. Therefore, in this paper, we propose a novel VAE-based SR model named SIGMA. SIGMA assumes that the prior of sequence representation conforms to a Gaussian mixture distribution, where each component of the distribution semantically corresponds to one of multiple interests. For multi-interest elicitation, SIGMA includes a probabilistic multi-interest extraction module that learns a unimodal Gaussian distribution for each interest according to implicit item hyper-categories. Additionally, to incorporate the multimodal interests into sequence representation learning, SIGMA constructs a multi-interest-aware ELBO, which is compatible with the Gaussian mixture prior. Extensive experiments on public datasets demonstrate the effectiveness of SIGMA. The code is available at https://github.com/libeibei95/SIGMA.

* Accepted by DASFAA 2025
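
The contrast between a unimodal and a Gaussian mixture prior can be illustrated with a small density computation. This is only a sketch of the prior itself, assuming diagonal covariances; the two "interest" centers and all function names are made up for the example, and it does not reproduce SIGMA's multi-interest-aware ELBO.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of a diagonal Gaussian evaluated at point x."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        for xi, m, s in zip(x, mean, std)
    )


def mixture_log_pdf(x, weights, means, stds):
    """Log-density of a Gaussian mixture, computed with log-sum-exp."""
    terms = [math.log(w) + gaussian_log_pdf(x, m, s)
             for w, m, s in zip(weights, means, stds)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical setup: a user with two well-separated interests in a 2-D space.
interest_means = [[-3.0, 0.0], [3.0, 0.0]]
interest_stds = [[1.0, 1.0], [1.0, 1.0]]
interest_weights = [0.5, 0.5]

z = [3.0, 0.0]  # a sequence representation sitting on the second interest
mixture_lp = mixture_log_pdf(z, interest_weights, interest_means, interest_stds)
unimodal_lp = gaussian_log_pdf(z, [0.0, 0.0], [1.0, 1.0])
```

In this toy case the mixture prior assigns a much higher log-density to `z` than a single standard Gaussian centered between the two interests, which is the intuition behind preferring a multimodal prior when users have several disparate interests.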

AsyCo: An Asymmetric Dual-task Co-training Model for Partial-label Learning

Jul 21, 2024

Beibei Li, Yiyuan Zheng, Beihong Jin, Tao Xiang, Haobo Wang, Lei Feng

Abstract: Partial-Label Learning (PLL) is a typical problem of weakly supervised learning, where each training instance is annotated with a set of candidate labels. Self-training PLL models achieve state-of-the-art performance but suffer from an error accumulation problem caused by mistakenly disambiguated instances. Although co-training can alleviate this issue by training two networks simultaneously and allowing them to interact with each other, most existing co-training methods train two structurally identical networks on the same task, i.e., they are symmetric, which makes it hard for the networks to correct each other because they share similar limitations. Therefore, in this paper, we propose an asymmetric dual-task co-training PLL model called AsyCo, which forces its two networks, i.e., a disambiguation network and an auxiliary network, to learn from different views explicitly by optimizing distinct tasks. Specifically, the disambiguation network is trained with a self-training PLL task to learn label confidence, while the auxiliary network is trained in a supervised learning paradigm to learn from the noisy pairwise similarity labels that are constructed according to the learned label confidence. Finally, the error accumulation problem is mitigated via information distillation and confidence refinement. Extensive experiments on both uniform and instance-dependent partially labeled datasets demonstrate the effectiveness of AsyCo. The code is available at https://github.com/libeibeics/AsyCo.

* 15 pages, accepted by Science China, Information Science

Denoising Long- and Short-term Interests for Sequential Recommendation

Jul 20, 2024

Xinyu Zhang, Beibei Li, Beihong Jin

Abstract: User interests can be viewed over different time scales, mainly including stable long-term preferences and changing short-term intentions, and their combination facilitates comprehensive sequential recommendation. However, existing work that focuses on different time scales of user modeling has ignored the negative effects of noise at different time scales, which hinders capturing actual user interests and cannot be resolved by conventional sequential denoising methods. In this paper, we propose a Long- and Short-term Interest Denoising Network (LSIDN), which employs different encoders and tailored denoising strategies to extract long- and short-term interests, respectively, achieving both comprehensive and robust user modeling. Specifically, we employ a session-level interest extraction and evolution strategy to avoid introducing inter-session behavioral noise into long-term interest modeling; we also adopt contrastive learning equipped with a homogeneous exchanging augmentation to alleviate the impact of unintentional behavioral noise on short-term interest modeling. Results of experiments on two public datasets show that LSIDN consistently outperforms state-of-the-art models and achieves significant robustness.

* 9 pages, accepted by SDM 2024

Orthogonal Hyper-category Guided Multi-interest Elicitation for Micro-video Matching

Jul 20, 2024

Beibei Li, Beihong Jin, Yisong Yu, Yiyuan Zheng, Jiageng Song, Wei Zhuo, Tao Xiang

Abstract: Watching micro-videos is becoming a part of public daily life. Usually, user watching behaviors are thought to be rooted in their multiple different interests. In this paper, we propose a model named OPAL for micro-video matching, which elicits a user's multiple heterogeneous interests by disentangling multiple soft and hard interest embeddings from user interactions. Moreover, OPAL employs a two-stage training strategy, in which the pre-training stage generates soft interests from historical interactions under the guidance of orthogonal hyper-categories of micro-videos, and the fine-tuning stage reinforces the degree of disentanglement among the interests and learns the temporal evolution of each interest of each user. We conduct extensive experiments on two real-world datasets. The results show that OPAL not only returns diversified micro-videos but also outperforms six state-of-the-art models in terms of recall and hit rate.

* 6 pages, accepted by ICME 2024

A Vlogger-augmented Graph Neural Network Model for Micro-video Recommendation

May 28, 2024

Weijiang Lai, Beihong Jin, Beibei Li, Yiyuan Zheng, Rui Zhao

Abstract: Existing micro-video recommendation models exploit the interactions between users and micro-videos and/or multi-modal information of micro-videos to predict the next micro-video a user will watch, ignoring the information related to vloggers, i.e., the producers of micro-videos. However, in micro-video scenarios, vloggers play a significant role in user-video interactions, since vloggers generally focus on specific topics and users tend to follow the vloggers they are interested in. Therefore, in this paper, we propose a vlogger-augmented graph neural network model, VA-GNN, which takes the effect of vloggers into consideration. Specifically, we construct a tripartite graph with users, micro-videos, and vloggers as nodes, capturing user preferences from different views, i.e., the video-view and the vlogger-view. Moreover, we conduct cross-view contrastive learning to keep the consistency between node embeddings from the two different views. In addition, when predicting the next user-video interaction, we adaptively combine the user's preferences for a video itself and for its vlogger. We conduct extensive experiments on two real-world datasets. The experimental results show that VA-GNN outperforms multiple existing GNN-based recommendation models.

* (2023) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track (pp. 684-699). Cham: Springer Nature Switzerland
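
Cross-view contrastive learning of the kind mentioned above is commonly implemented with an InfoNCE-style loss. The sketch below assumes that formulation (the abstract does not name the exact loss): the embedding of node i in one view is pulled toward the embedding of the same node in the other view and pushed away from the other nodes. The function names, temperature, and toy embeddings are illustrative assumptions, and embeddings must be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.2):
    """Mean InfoNCE loss where (view_a[i], view_b[i]) are the positive pairs."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denom - logits[i])  # -log softmax of the positive
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical) for three nodes under two views.
video_view = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
vlogger_view_consistent = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
vlogger_view_shuffled = [[0.1, 0.9], [-0.9, 0.1], [0.9, 0.1]]

loss_consistent = info_nce(video_view, vlogger_view_consistent)
loss_shuffled = info_nce(video_view, vlogger_view_shuffled)
```

The loss is small when the two views agree on each node and large when they disagree, so minimizing it encourages consistency between the views.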

A Deep Behavior Path Matching Network for Click-Through Rate Prediction

Feb 01, 2023

Jian Dong, Yisong Yu, Yapeng Zhang, Yimin Lv, Shuli Wang, Beihong Jin, Yongkang Wang, Xingxing Wang, Dong Wang

Abstract: User behaviors on an e-commerce app not only contain different kinds of feedback on items but also sometimes imply cognitive clues about the user's decision-making. To understand the psychological process behind user decisions, we introduce the behavior path and propose to match the user's current behavior path with historical behavior paths to predict user behaviors on the app. Further, we design a deep neural network for behavior path matching and address three difficulties in modeling behavior paths: sparsity, noise interference, and accurate matching of behavior paths. In particular, we leverage contrastive learning to augment user behavior paths, provide behavior path self-activation to alleviate the effect of noise, and adopt a two-level matching mechanism to identify the most appropriate candidate. Our model shows excellent performance on two real-world datasets, outperforming the state-of-the-art CTR model. Moreover, our model has been deployed on the Meituan food delivery platform and has delivered cumulative improvements of 1.6% in CTR and 1.8% in advertising revenue.

* Accepted by WWW 2023

A New Approach to Training Multiple Cooperative Agents for Autonomous Driving

Sep 05, 2022

Ruiyang Yang, Siheng Li, Beihong Jin

Abstract: Training multiple agents to perform safe and cooperative control in the complex scenarios of autonomous driving has been a challenge. For a small fleet of cars moving together, this paper proposes Lepus, a new approach to training multiple agents. Lepus trains the agents in a purely cooperative manner, with the parameters of the policy networks and the reward function shared among the agents. In particular, Lepus pre-trains the policy networks via an adversarial process, improving their collaborative decision-making capability and, in turn, the stability of car driving. Moreover, to alleviate the problem of sparse rewards, Lepus learns an approximate reward function from expert trajectories by combining a random network and a distillation network. We conduct extensive experiments on the MADRaS simulation platform. The experimental results show that multiple agents trained by Lepus avoid as many collisions as possible while driving simultaneously and outperform four other methods, namely DDPG-FDE, PSDDPG, MADDPG, and MAGAIL(DDPG), in terms of stability.

* 8 pages, accepted by IJCNN 2022

Improving Micro-video Recommendation by Controlling Position Bias

Aug 09, 2022

Yisong Yu, Beihong Jin, Jiageng Song, Beibei Li, Yiyuan Zheng, Wei Zhu

Abstract: As micro-video apps become popular, the numbers of micro-videos and users increase rapidly, which highlights the importance of micro-video recommendation. Although micro-video recommendation can be naturally treated as sequential recommendation, previous sequential recommendation models do not fully consider the characteristics of micro-video apps, and in their inductive biases, the role of positions does not match the reality of the micro-video scenario. Therefore, in this paper, we present a model named PDMRec (Position Decoupled Micro-video Recommendation). PDMRec applies separate self-attention modules to model micro-video information and positional information and then aggregates them, avoiding the noisy correlations between micro-video semantics and positional information being encoded into the sequence embeddings. Moreover, PDMRec proposes contrastive learning strategies that closely match the characteristics of the micro-video scenario, thus reducing the interference from micro-video positions in sequences. We conduct extensive experiments on two real-world datasets. The experimental results show that PDMRec outperforms multiple existing state-of-the-art models and achieves significant performance improvements.

* Accepted by ECML PKDD 2022
