Valeria Capretti

Efficient Reinforcement Learning from Human Feedback via Bayesian Preference Inference

Nov 06, 2025
Via arXiv