Zhou Ziheng

University of California, Los Angeles

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Aug 07, 2025
Via arXiv

Aligner: One Global Token is Worth Millions of Parameters When Aligning Large Language Models

Dec 09, 2023
Via arXiv