Yuhang Zhou

From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning

Jun 05, 2026
Via arXiv

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

May 31, 2026
Via arXiv

Agentic Recommender System with Hierarchical Belief-State Memory

May 14, 2026
Via arXiv

CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation

May 06, 2026
Via arXiv

Deep Reprogramming Distillation for Medical Foundation Models

May 06, 2026
Via arXiv

Synthetic Sandbox for Training Machine Learning Engineering Agents

Apr 06, 2026
Via arXiv

LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems

Mar 26, 2026
Via arXiv

LLM-Confidence Reranker: A Training-Free Approach for Enhancing Retrieval-Augmented Generation Systems

Feb 14, 2026
Via arXiv

EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization

Feb 05, 2026
Via arXiv

OffSeeker: Online Reinforcement Learning Is Not All You Need for Deep Research Agents

Jan 26, 2026
Via arXiv