David Zhu

Optimal Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning

Jun 14, 2024
Via arXiv