Erhan Xu

Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic

Mar 03, 2026
Via arXiv

Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text

Jan 29, 2026
Via arXiv