Marc Lanctot

Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines

Sep 09, 2018
Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling

Via arXiv

Emergent Communication through Negotiation

Apr 11, 2018
Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z. Leibo, Karl Tuyls, Stephen Clark

Via arXiv

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

Dec 05, 2017
David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis

Via arXiv

Deep Q-learning from Demonstrations

Nov 22, 2017
Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys

Via arXiv

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

Nov 07, 2017
Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel

Via arXiv

Value-Decomposition Networks For Cooperative Multi-Agent Learning

Jun 16, 2017
Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel

Via arXiv

Memory-Efficient Backpropagation Through Time

Jun 10, 2016
Audrūnas Gruslys, Remi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves

Via arXiv

Convolution by Evolution: Differentiable Pattern Producing Networks

Jun 08, 2016
Chrisantha Fernando, Dylan Banarse, Malcolm Reynolds, Frederic Besse, David Pfau, Max Jaderberg, Marc Lanctot, Daan Wierstra

Via arXiv

Dueling Network Architectures for Deep Reinforcement Learning

Apr 05, 2016
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas

Via arXiv

Monte Carlo Tree Search with Heuristic Evaluations using Implicit Minimax Backups

Jun 19, 2014
Marc Lanctot, Mark H. M. Winands, Tom Pepels, Nathan R. Sturtevant

Via arXiv