Lifeng Shang

ARTIS: Agentic Risk-Aware Test-Time Scaling via Iterative Simulation

Feb 03, 2026

InfMem: Learning System-2 Memory Control for Long-Context Agent

Feb 02, 2026

OVD: On-policy Verbal Distillation

Jan 29, 2026

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation

Jan 26, 2026

Teaching Large Reasoning Models Effective Reflection

Jan 19, 2026

Group Pattern Selection Optimization: Let LRMs Pick the Right Pattern for Reasoning

Jan 12, 2026

SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving

Jan 07, 2026

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Dec 27, 2025

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Dec 23, 2025

Rethinking Expert Trajectory Utilization in LLM Post-training

Dec 12, 2025