Dinggen Zhang

STRIDE: Strategic Trajectory Reasoning via Discriminative Estimation for Verifiable Reinforcement Learning

Jun 14, 2026
Via arXiv

Plan Then Action: High-Level Planning Guidance Reinforcement Learning for LLM Reasoning

Oct 02, 2025
Via arXiv