Sara Rajaram

Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning

Jun 14, 2025
Via arXiv