Marc Lanctot

Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

Jun 17, 2021
Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

Feb 13, 2021
Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy Greenwald

Solving Common-Payoff Games with Approximate Policy Iteration

Jan 11, 2021
Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

Hindsight and Sequential Rationality of Correlated Play

Dec 17, 2020
Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy Greenwald, Michael Bowling

Negotiating Team Formation Using Deep Reinforcement Learning

Oct 20, 2020
Yoram Bachrach, Richard Everett, Edward Hughes, Angeliki Lazaridou, Joel Z. Leibo, Marc Lanctot, Michael Johanson, Wojciech M. Czarnecki, Thore Graepel

The Advantage Regret-Matching Actor-Critic

Aug 27, 2020
Audrūnas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Perolat, Dustin Morrill, Vinicius Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Jun 17, 2020
Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach

Approximate exploitability: Learning a best response in large games

Apr 20, 2020
Finbarr Timbers, Edward Lockhart, Martin Schmid, Marc Lanctot, Michael Bowling
