
Shalabh Bhatnagar

Critic-Actor for Average Reward MDPs with Function Approximation: A Finite-Time Analysis

Feb 02, 2024
Prashansa Panda, Shalabh Bhatnagar

Approximate Linear Programming and Decentralized Policy Improvement in Cooperative Multi-agent Markov Decision Processes

Nov 20, 2023
Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar

Finite Time Analysis of Constrained Actor Critic and Constrained Natural Actor Critic Algorithms

Oct 25, 2023
Prashansa Panda, Shalabh Bhatnagar

The Reinforce Policy Gradient Algorithm Revisited

Oct 08, 2023
Shalabh Bhatnagar

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

May 20, 2023
Naman Saxena, Subhojyoti Khastigir, Shishir Kolathaya, Shalabh Bhatnagar

A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks

May 20, 2023
Arunselvan Ramaswamy, Shalabh Bhatnagar, Naman Saxena

A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning

Apr 21, 2023
Mizhaan Prajit Maniyar, Akash Mondal, Prashanth L. A., Shalabh Bhatnagar

n-Step Temporal Difference Learning with Optimal n

Mar 13, 2023
Lakshmi Mandal, Shalabh Bhatnagar

Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias

Dec 20, 2022
Shalabh Bhatnagar, Prashanth L. A.

Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm

Oct 14, 2022
Ashish Kumar Jayant, Shalabh Bhatnagar
