Zhiguo Yang

Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning

Apr 10, 2026
Via arXiv

Revisiting the Data Sampling in Multimodal Post-training from a Difficulty-Distinguish View

Nov 10, 2025
Via arXiv

Beyond Scaling Law: A Data-Efficient Distillation Framework for Reasoning

Aug 13, 2025
Via arXiv

Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion

May 19, 2025
Via arXiv