Zhuowen Han

Finding the Evidence: Discovering Decision-Supporting Tokens for On-Policy Reasoning Distillation

Jun 22, 2026
Via arXiv

Learning to Act under Noise: Enhancing Agent Robustness via Noisy Environments

May 26, 2026
Via arXiv

Self-Distilled Agentic Reinforcement Learning

May 14, 2026
Via arXiv

MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning

May 13, 2026
Via arXiv

Why Does Reinforcement Learning Generalize? A Feature-Level Mechanistic Study of Post-Training in Large Language Models

Apr 27, 2026
Via arXiv

DEP: A Decentralized Large Language Model Evaluation Protocol

Mar 01, 2026
Via arXiv

LongCat-Flash-Thinking-2601 Technical Report

Jan 23, 2026
Via arXiv

Revisiting Entropy in Reinforcement Learning for Large Reasoning Models

Nov 08, 2025
Via arXiv