Martha White

Continual Auxiliary Task Learning

Feb 22, 2022

A Temporal-Difference Approach to Policy Gradient Estimation

Feb 04, 2022

An Alternate Policy Gradient Estimator for Softmax Policies

Dec 22, 2021

Understanding Feature Transfer Through Representation Alignment

Dec 15, 2021

Off-Policy Actor-Critic with Emphatic Weightings

Nov 16, 2021

Exploiting Action Impact Regularity and Partially Known Models for Offline Reinforcement Learning

Nov 15, 2021

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences

Jul 17, 2021

Predictive Representation Learning for Language Modeling

May 29, 2021

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning

Apr 28, 2021

Scalable Online Recurrent Learning Using Columnar Neural Networks

Mar 09, 2021