Ankit Kanwar

Safety-Biased Policy Optimisation: Towards Hard-Constrained Reinforcement Learning via Trust Regions

Dec 29, 2025
Via arXiv