Yechen Zhang

Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning

Mar 10, 2026
Via arXiv

How to Set the Batch Size for Large-Scale Pre-training?

Jan 08, 2026
Via arXiv

Soft Decomposed Policy-Critic: Bridging the Gap for Effective Continuous Control with Discrete RL

Aug 20, 2023
Via arXiv