Sinong Zhan

Token Buncher: Shielding LLMs from Harmful Reinforcement Learning Fine-Tuning

Aug 28, 2025
Via arXiv

Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning

Jul 23, 2025
Via arXiv