Yuxuan Sheng

OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models

Apr 05, 2026
Via arXiv

Harnessing RLHF for Robust Unanswerability Recognition and Trustworthy Response Generation in LLMs

Jul 22, 2025
Via arXiv