
Satinder Singh

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

Jun 30, 2022

Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality

May 26, 2022

GrASP: Gradient-Based Affordance Selection for Planning

Feb 08, 2022

On the Expressivity of Markov Reward

Nov 01, 2021

Learning to Learn End-to-End Goal-Oriented Dialog From Related Dialog Tasks

Oct 10, 2021

Bootstrapped Meta-Learning

Sep 09, 2021

Proper Value Equivalence

Jun 18, 2021

Discovering Diverse Nearly Optimal Policies with Successor Features

Jun 01, 2021

Reward is enough for convex MDPs

Jun 01, 2021

Reinforcement Learning of Implicit and Explicit Control Flow in Instructions

Feb 25, 2021