Huizhen Yu

A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays

Dec 22, 2023
Huizhen Yu, Yi Wan, Richard S. Sutton


Two geometric input transformation methods for fast online reinforcement learning with neural nets

Sep 06, 2018
Sina Ghiassian, Huizhen Yu, Banafsheh Rafiee, Richard S. Sutton


On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning

Mar 28, 2018
Huizhen Yu


On Convergence of Emphatic Temporal-Difference Learning

Dec 28, 2017
Huizhen Yu


Multi-step Off-policy Learning Without Importance Sampling Ratios

Feb 09, 2017
Ashique Rupam Mahmood, Huizhen Yu, Richard S. Sutton


Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize

Jan 20, 2017
Huizhen Yu


Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms

May 06, 2016
Huizhen Yu


Emphatic Temporal-Difference Learning

Jul 06, 2015
A. Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton


Discretized Approximations for POMDP with Average Cost

Jul 11, 2012
Huizhen Yu, Dimitri Bertsekas


A Function Approximation Approach to Estimation of Policy Gradient for POMDP with Structured Policies

Jul 04, 2012
Huizhen Yu
