Shixiang Gu

Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives

Oct 09, 2018
George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison

Data-Efficient Hierarchical Reinforcement Learning

Oct 05, 2018
Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine

Near-Optimal Representation Learning for Hierarchical Reinforcement Learning

Oct 02, 2018
Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine

The Mirage of Action-Dependent Baselines in Reinforcement Learning

Apr 06, 2018
George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Feb 25, 2018
Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

Nov 18, 2017
Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine

Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control

Oct 16, 2017
Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck

Categorical Reparameterization with Gumbel-Softmax

Aug 05, 2017
Eric Jang, Shixiang Gu, Ben Poole

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

Jun 01, 2017
Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

Feb 27, 2017
Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
