Sirui Han

LABSHIELD: A Multimodal Benchmark for Safety-Critical Reasoning and Planning in Scientific Laboratories

Mar 12, 2026
Via arXiv

Not Just the Destination, But the Journey: Reasoning Traces Causally Shape Generalization Behaviors

Mar 12, 2026
Via arXiv

DC-W2S: Dual-Consensus Weak-to-Strong Training for Reliable Process Reward Modeling in Biological Reasoning

Mar 09, 2026
Via arXiv

TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation

Feb 09, 2026
Via arXiv

What, Whether and How? Unveiling Process Reward Models for Thinking with Images Reasoning

Feb 09, 2026
Via arXiv

MemFly: On-the-Fly Memory Optimization via Information Bottleneck

Feb 08, 2026
Via arXiv

Learning While Staying Curious: Entropy-Preserving Supervised Fine-Tuning via Adaptive Self-Distillation for Large Reasoning Models

Feb 02, 2026
Via arXiv

Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification

Jan 30, 2026
Via arXiv

Glance-or-Gaze: Incentivizing LMMs to Adaptively Focus Search via Reinforcement Learning

Jan 20, 2026
Via arXiv

LRAS: Advanced Legal Reasoning with Agentic Search

Jan 12, 2026
Via arXiv