Shixuan Liu

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Mar 19, 2026
Via arXiv

Revealing Behavioral Plasticity in Large Language Models: A Token-Conditional Perspective

Mar 09, 2026
Via arXiv

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

Feb 04, 2026
Via arXiv

Detecting Unobserved Confounders: A Kernelized Regression Approach

Jan 01, 2026
Via arXiv

Learning complete and explainable visual representations from itemized text supervision

Dec 11, 2025
Via arXiv

CoCo-MILP: Inter-Variable Contrastive and Intra-Constraint Competitive MILP Solution Prediction

Nov 12, 2025
Via arXiv

GRACE: Generative Representation Learning via Contrastive Policy Optimization

Oct 06, 2025
Via arXiv

Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security

Jul 29, 2025
Via arXiv

Group Sequence Policy Optimization

Jul 24, 2025
Via arXiv

Stable Reinforcement Learning for Efficient Reasoning

May 23, 2025
Via arXiv