Wenping Hu

DARL: Encouraging Diverse Answers for General Reasoning without Verifiers

Jan 21, 2026

From Tags to Trees: Structuring Fine-Grained Knowledge for Controllable Data Selection in LLM Instruction Tuning

Jan 20, 2026

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Sep 30, 2025

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Aug 12, 2025

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models

Dec 10, 2024