Yuhang Zhou

Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation

Jun 18, 2025

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

Jun 11, 2025

OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

May 29, 2025

Conf-GNNRec: Quantifying and Calibrating the Prediction Confidence for GNN-based Recommendation Methods

May 22, 2025

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

May 21, 2025

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

May 21, 2025

Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches

May 18, 2025

DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective

Mar 17, 2025

Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models

Mar 13, 2025

DivIL: Unveiling and Addressing Over-Invariance for Out-of-Distribution Generalization

Feb 18, 2025