Pankaj Kumar Barman

Global Convergence of Average Reward Constrained MDPs with Neural Critic and General Policy Parameterization

Mar 08, 2026
Via arXiv