Yunhao Tang

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Feb 09, 2023

An Analysis of Quantile Temporal-Difference Learning

Jan 11, 2023

Understanding Self-Predictive Learning for Reinforcement Learning

Dec 06, 2022

The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning

Jul 15, 2022

BYOL-Explore: Exploration by Bootstrapped Prediction

Jun 16, 2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal

May 27, 2022

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

May 16, 2022

Marginalized Operators for Off-policy Reinforcement Learning

Mar 30, 2022

Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning

Dec 14, 2021

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation

Jun 24, 2021