Zihan Dong

Decompose Sparsely Where You Should, Absorb Densely Where You Should Not

Jun 12, 2026
Via arXiv

RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains

May 27, 2026
Via arXiv

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

May 19, 2026
Via arXiv

Evidence Over Plans: Online Trajectory Verification for Skill Distillation

May 09, 2026
Via arXiv

DARE: Difficulty-Adaptive Reinforcement Learning with Co-Evolved Difficulty Estimation

May 09, 2026
Via arXiv

LUCID-SAE: Learning Unified Vision-Language Sparse Codes for Interpretable Concept Discovery

Feb 07, 2026
Via arXiv

Evaluating LLMs When They Do Not Know the Answer: Statistical Evaluation of Mathematical Reasoning via Comparative Signals

Feb 03, 2026
Via arXiv

Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training

Feb 02, 2026
Via arXiv

Labels or Preferences? Budget-Constrained Learning with Human Judgments over AI-Generated Outputs

Jan 19, 2026
Via arXiv

Students' Perceptions and Preferences of Generative Artificial Intelligence Feedback for Programming

Dec 17, 2023
Via arXiv