Shidong Yang

ReSum: Synergizing LLM Reasoning and Summarization with Reinforcement Learning

Jun 11, 2026

APPO: Agentic Procedural Policy Optimization

Jun 10, 2026

Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

Jun 09, 2026

Learning Agentic Policy from Action Guidance

May 12, 2026

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Apr 09, 2026

Entropy-Guided Data-Efficient Training for Multimodal Reasoning Reward Models

Feb 02, 2026

Step-DeepResearch Technical Report

Dec 24, 2025

Differentiable Collision-Supervised Tooth Arrangement Network with a Decoupling Perspective

Sep 18, 2024