Tom Zahavy

Diversifying AI: Towards Creative Chess with AlphaZero

Aug 29, 2023

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT

Aug 24, 2023

Optimism and Adaptivity in Policy Optimization

Jun 18, 2023

Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

Apr 08, 2023

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs

Feb 02, 2023

Optimistic Meta-Gradients

Jan 09, 2023

POMRL: No-Regret Learning-to-Plan with Increasing Horizons

Dec 30, 2022

Discovering Evolution Strategies via Meta-Black-Box Optimization

Nov 25, 2022

Meta-Gradients in Non-Stationary Environments

Sep 13, 2022

Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality

May 26, 2022