Edwin Hamel-De le Court

Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning

Nov 13, 2025
Via arXiv

Probabilistic Shielding for Safe Reinforcement Learning

Mar 09, 2025
Via arXiv