Ziqiang Dong

Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation

Jun 17, 2026

RAVE: Re-Allocating Visual Attention in Large Multimodal Models

May 18, 2026

Revisiting Reinforcement Learning with Verifiable Rewards from a Contrastive Perspective

May 13, 2026

ADHint: Adaptive Hints with Difficulty Priors for Reinforcement Learning

Dec 15, 2025