Richard S. Sutton

Asynchronous Stochastic Approximation and Average-Reward Reinforcement Learning

Sep 05, 2024

On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes

Aug 29, 2024

An Idiosyncrasy of Time-discretization in Reinforcement Learning

Jun 21, 2024

Reward Centering

May 16, 2024

A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays

Dec 22, 2023

Iterative Option Discovery for Planning, by Planning

Oct 02, 2023

Value-aware Importance Weighting for Off-policy Reinforcement Learning

Jun 27, 2023

Maintaining Plasticity in Deep Continual Learning

Jun 23, 2023

On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly-Communicating MDPs

Sep 30, 2022

The Alberta Plan for AI Research

Aug 23, 2022