Gugan Thoppe

Reliable Critics: Monotonic Improvement and Convergence Guarantees for Reinforcement Learning

Jun 08, 2025

Reinforcement Learning with Quasi-Hyperbolic Discounting

Sep 16, 2024

Online Learning of Weakly Coupled MDP Policies for Load Balancing and Auto Scaling

Jun 20, 2024

Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries

Mar 15, 2024

VaR and CVaR Estimation in a Markov Cost Process: Lower and Upper Bounds

Oct 17, 2023

Online Learning with Adversaries: A Differential Inclusion Analysis

Apr 04, 2023

SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search

Jan 30, 2023

Improving Sample Efficiency in Evolutionary RL Using Off-Policy Ranking

Aug 22, 2022

Approximate Q-learning and SARSA under the $ε$-greedy Policy: a Differential Inclusion Analysis

May 26, 2022

A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning

Nov 10, 2021