Shimon Whiteson

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients

May 06, 2021
Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson

Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning

Mar 22, 2021
Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing

Mar 01, 2021
Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson

Breaking the Deadly Triad with a Target Network

Feb 09, 2021
Shangtong Zhang, Hengshuai Yao, Shimon Whiteson

Deep Interactive Bayesian Reinforcement Learning via Meta-Learning

Jan 11, 2021
Luisa Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann

Average-Reward Off-Policy Policy Evaluation with Function Approximation

Jan 08, 2021
Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

Nov 18, 2020
Christian Schroeder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson
