Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiajie Su

Generalizable Multimodal Large Language Model Editing via Invariant Trajectory Learning

Jan 30, 2026

Jiajie Su, Haoyuan Wang, Xiaohua Feng, Yunshan Ma, Xiaobo Xia, Yuyuan Li, Xiaolin Zheng, Jianmao Xiao, Chaochao Chen

Abstract:Knowledge editing emerges as a crucial technique for efficiently correcting incorrect or outdated knowledge in large language models (LLM). Existing editing methods rely on a rigid mapping from parameter or module modifications to output, which causes the generalization limitation in Multimodal LLM (MLLM). In this paper, we reformulate MLLM editing as an out-of-distribution (OOD) generalization problem, where the goal is to discern semantic shift with factual shift and thus achieve robust editing among diverse cross-modal prompting. The key challenge of this OOD problem lies in identifying invariant causal trajectories that generalize accurately while suppressing spurious correlations. To address it, we propose ODEdit, a plug-and-play invariant learning based framework that optimizes the tripartite OOD risk objective to simultaneously enhance editing reliability, locality, and generality.We further introduce an edit trajectory invariant learning method, which integrates a total variation penalty into the risk minimization objective to stabilize edit trajectories against environmental variations. Theoretical analysis and extensive experiments demonstrate the effectiveness of ODEdit.

Via

Access Paper or Ask Questions

Potent but Stealthy: Rethink Profile Pollution against Sequential Recommendation via Bi-level Constrained Reinforcement Paradigm

Nov 14, 2025

Jiajie Su, Zihan Nan, Yunshan Ma, Xiaobo Xia, Xiaohua Feng, Weiming Liu, Xiang Chen, Xiaolin Zheng, Chaochao Chen

Abstract:Sequential Recommenders, which exploit dynamic user intents through interaction sequences, is vulnerable to adversarial attacks. While existing attacks primarily rely on data poisoning, they require large-scale user access or fake profiles thus lacking practicality. In this paper, we focus on the Profile Pollution Attack that subtly contaminates partial user interactions to induce targeted mispredictions. Previous PPA methods suffer from two limitations, i.e., i) over-reliance on sequence horizon impact restricts fine-grained perturbations on item transitions, and ii) holistic modifications cause detectable distribution shifts. To address these challenges, we propose a constrained reinforcement driven attack CREAT that synergizes a bi-level optimization framework with multi-reward reinforcement learning to balance adversarial efficacy and stealthiness. We first develop a Pattern Balanced Rewarding Policy, which integrates pattern inversion rewards to invert critical patterns and distribution consistency rewards to minimize detectable shifts via unbalanced co-optimal transport. Then we employ a Constrained Group Relative Reinforcement Learning paradigm, enabling step-wise perturbations through dynamic barrier constraints and group-shared experience replay, achieving targeted pollution with minimal detectability. Extensive experiments demonstrate the effectiveness of CREAT.

Via

Access Paper or Ask Questions

A Remarkably Efficient Paradigm to Multimodal Large Language Models for Sequential Recommendation

Nov 11, 2025

Qiyong Zhong, Jiajie Su, Ming Yang, Yunshan Ma, Xiaolin Zheng, Chaochao Chen

Figure 1 for A Remarkably Efficient Paradigm to Multimodal Large Language Models for Sequential Recommendation

Figure 2 for A Remarkably Efficient Paradigm to Multimodal Large Language Models for Sequential Recommendation

Figure 3 for A Remarkably Efficient Paradigm to Multimodal Large Language Models for Sequential Recommendation

Figure 4 for A Remarkably Efficient Paradigm to Multimodal Large Language Models for Sequential Recommendation

Abstract:Sequential recommendations (SR) predict users' future interactions based on their historical behavior. The rise of Large Language Models (LLMs) has brought powerful generative and reasoning capabilities, significantly enhancing SR performance, while Multimodal LLMs (MLLMs) further extend this by introducing data like images and interactive relationships. However, critical issues remain, i.e., (a) Suboptimal item representations caused by lengthy and redundant descriptions, leading to inefficiencies in both training and inference; (b) Modality-related cognitive bias, as LLMs are predominantly pretrained on textual data, limiting their ability to effectively integrate and utilize non-textual modalities; (c) Weakening sequential perception in long interaction sequences, where attention mechanisms struggle to capture earlier interactions, hindering the modeling of long-range dependencies. To address these issues, we propose Speeder, an efficient MLLM-based paradigm for SR featuring three key innovations: 1) Multimodal Representation Compression (MRC), which condenses item attributes into concise yet informative tokens, reducing redundancy and computational cost; 2) Modality-aware Progressive Optimization (MPO), enabling gradual learning of multimodal representations; 3) Sequential Position Awareness Enhancement (SPAE), improving the LLM's capability to capture both relative and absolute sequential dependencies in long interaction sequences. Extensive experiments on real-world datasets demonstrate the effectiveness and efficiency of Speeder. Speeder increases training speed to 250% of the original while reducing inference time to 25% on the Amazon dataset.

Via

Access Paper or Ask Questions

Distilling Transitional Pattern to Large Language Models for Multimodal Session-based Recommendation

Apr 13, 2025

Jiajie Su, Qiyong Zhong, Yunshan Ma, Weiming Liu, Chaochao Chen, Xiaolin Zheng, Jianwei Yin, Tat-Seng Chua

Abstract:Session-based recommendation (SBR) predicts the next item based on anonymous sessions. Traditional SBR explores user intents based on ID collaborations or auxiliary content. To further alleviate data sparsity and cold-start issues, recent Multimodal SBR (MSBR) methods utilize simplistic pre-trained models for modality learning but have limitations in semantic richness. Considering semantic reasoning abilities of Large Language Models (LLM), we focus on the LLM-enhanced MSBR scenario in this paper, which leverages LLM cognition for comprehensive multimodal representation generation, to enhance downstream MSBR. Tackling this problem faces two challenges: i) how to obtain LLM cognition on both transitional patterns and inherent multimodal knowledge, ii) how to align both features into one unified LLM, minimize discrepancy while maximizing representation utility. To this end, we propose a multimodal LLM-enhanced framework TPAD, which extends a distillation paradigm to decouple and align transitional patterns for promoting MSBR. TPAD establishes parallel Knowledge-MLLM and Transfer-MLLM, where the former interprets item knowledge-reflected features and the latter extracts transition-aware features underneath sessions. A transitional pattern alignment module harnessing mutual information estimation theory unites two MLLMs, alleviating distribution discrepancy and distilling transitional patterns into modal representations. Extensive experiments on real-world datasets demonstrate the effectiveness of our framework.

Via

Access Paper or Ask Questions

Enhancing Attributed Graph Networks with Alignment and Uniformity Constraints for Session-based Recommendation

Oct 14, 2024

Xinping Zhao, Chaochao Chen, Jiajie Su, Yizhao Zhang, Baotian Hu

Figure 1 for Enhancing Attributed Graph Networks with Alignment and Uniformity Constraints for Session-based Recommendation

Figure 2 for Enhancing Attributed Graph Networks with Alignment and Uniformity Constraints for Session-based Recommendation

Figure 3 for Enhancing Attributed Graph Networks with Alignment and Uniformity Constraints for Session-based Recommendation

Figure 4 for Enhancing Attributed Graph Networks with Alignment and Uniformity Constraints for Session-based Recommendation

Abstract:Session-based Recommendation (SBR), seeking to predict a user's next action based on an anonymous session, has drawn increasing attention for its practicability. Most SBR models only rely on the contextual transitions within a short session to learn item representations while neglecting additional valuable knowledge. As such, their model capacity is largely limited by the data sparsity issue caused by short sessions. A few studies have exploited the Modeling of Item Attributes (MIA) to enrich item representations. However, they usually involve specific model designs that can hardly transfer to existing attribute-agnostic SBR models and thus lack universality. In this paper, we propose a model-agnostic framework, named AttrGAU (Attributed Graph Networks with Alignment and Uniformity Constraints), to bring the MIA's superiority into existing attribute-agnostic models, to improve their accuracy and robustness for recommendation. Specifically, we first build a bipartite attributed graph and design an attribute-aware graph convolution to exploit the rich attribute semantics hidden in the heterogeneous item-attribute relationship. We then decouple existing attribute-agnostic SBR models into the graph neural network and attention readout sub-modules to satisfy the non-intrusive requirement. Lastly, we design two representation constraints, i.e., alignment and uniformity, to optimize distribution discrepancy in representation between the attribute semantics and collaborative semantics. Extensive experiments on three public benchmark datasets demonstrate that the proposed AttrGAU framework can significantly enhance backbone models' recommendation performance and robustness against data sparsity and data noise issues. Our implementation codes will be available at https://github.com/ItsukiFujii/AttrGAU.

* 11 pages, 4 figures, 5 tables. Accepted by ICWS 2024

Via

Access Paper or Ask Questions

Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

Feb 22, 2024

Jiajie Su, Chaochao Chen, Zibin Lin, Xi Li, Weiming Liu, Xiaolin Zheng

Figure 1 for Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

Figure 2 for Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

Figure 3 for Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

Figure 4 for Personalized Behavior-Aware Transformer for Multi-Behavior Sequential Recommendation

Abstract:Sequential Recommendation (SR) captures users' dynamic preferences by modeling how users transit among items. However, SR models that utilize only single type of behavior interaction data encounter performance degradation when the sequences are short. To tackle this problem, we focus on Multi-Behavior Sequential Recommendation (MBSR) in this paper, which aims to leverage time-evolving heterogeneous behavioral dependencies for better exploring users' potential intents on the target behavior. Solving MBSR is challenging. On the one hand, users exhibit diverse multi-behavior patterns due to personal characteristics. On the other hand, there exists comprehensive co-influence between behavior correlations and item collaborations, the intensity of which is deeply affected by temporal factors. To tackle these challenges, we propose a Personalized Behavior-Aware Transformer framework (PBAT) for MBSR problem, which models personalized patterns and multifaceted sequential collaborations in a novel way to boost recommendation performance. First, PBAT develops a personalized behavior pattern generator in the representation layer, which extracts dynamic and discriminative behavior patterns for sequential learning. Second, PBAT reforms the self-attention layer with a behavior-aware collaboration extractor, which introduces a fused behavior-aware attention mechanism for incorporating both behavioral and temporal impacts into collaborative transitions. We conduct experiments on three benchmark datasets and the results demonstrate the effectiveness and interpretability of our framework. Our implementation code is released at https://github.com/TiliaceaeSU/PBAT.

* Proceedings of the 31st ACM International Conference on Multimedia. 2023: 6321-6331

Via

Access Paper or Ask Questions

DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation

Sep 21, 2022

Xiaolin Zheng, Jiajie Su, Weiming Liu, Chaochao Chen

Figure 1 for DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation

Figure 2 for DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation

Figure 3 for DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation

Figure 4 for DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation

Abstract:Sequential Recommendation (SR) characterizes evolving patterns of user behaviors by modeling how users transit among items. However, the short interaction sequences limit the performance of existing SR. To solve this problem, we focus on Cross-Domain Sequential Recommendation (CDSR) in this paper, which aims to leverage information from other domains to improve the sequential recommendation performance of a single domain. Solving CDSR is challenging. On the one hand, how to retain single domain preferences as well as integrate cross-domain influence remains an essential problem. On the other hand, the data sparsity problem cannot be totally solved by simply utilizing knowledge from other domains, due to the limited length of the merged sequences. To address the challenges, we propose DDGHM, a novel framework for the CDSR problem, which includes two main modules, i.e., dual dynamic graph modeling and hybrid metric training. The former captures intra-domain and inter-domain sequential transitions through dynamically constructing two-level graphs, i.e., the local graphs and the global graph, and incorporating them with a fuse attentive gating mechanism. The latter enhances user and item representations by employing hybrid metric learning, including collaborative metric for achieving alignment and contrastive metric for preserving uniformity, to further alleviate data sparsity issue and improve prediction accuracy. We conduct experiments on two benchmark datasets and the results demonstrate the effectiveness of DDHMG.

Via

Access Paper or Ask Questions

Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation

Feb 10, 2022

Chaochao Chen, Huiwen Wu, Jiajie Su, Lingjuan Lyu, Xiaolin Zheng, Li Wang

Figure 1 for Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation

Figure 2 for Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation

Figure 3 for Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation

Figure 4 for Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation

Abstract:Cross Domain Recommendation (CDR) has been popularly studied to alleviate the cold-start and data sparsity problem commonly existed in recommender systems. CDR models can improve the recommendation performance of a target domain by leveraging the data of other source domains. However, most existing CDR models assume information can directly 'transfer across the bridge', ignoring the privacy issues. To solve the privacy concern in CDR, in this paper, we propose a novel two stage based privacy-preserving CDR framework (PriCDR). In the first stage, we propose two methods, i.e., Johnson-Lindenstrauss Transform (JLT) based and Sparse-awareJLT (SJLT) based, to publish the rating matrix of the source domain using differential privacy. We theoretically analyze the privacy and utility of our proposed differential privacy based rating publishing methods. In the second stage, we propose a novel heterogeneous CDR model (HeteroCDR), which uses deep auto-encoder and deep neural network to model the published source rating matrix and target rating matrix respectively. To this end, PriCDR can not only protect the data privacy of the source domain, but also alleviate the data sparsity of the source domain. We conduct experiments on two benchmark datasets and the results demonstrate the effectiveness of our proposed PriCDR and HeteroCDR.

* Accepted by TheWebConf'22 (WWW'22)

Via

Access Paper or Ask Questions