Yuhuai Wu

INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving

Jul 06, 2020

Modelling High-Level Mathematical Reasoning in Mechanised Declarative Proofs

Jun 13, 2020

Options as responses: Grounding behavioural hierarchies in multi-agent RL

Jun 06, 2019

Concurrent Meta Reinforcement Learning

Mar 07, 2019

ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning

Feb 12, 2019

Understanding Short-Horizon Bias in Stochastic Meta-Optimization

Mar 06, 2018

Some Considerations on Learning to Explore via Meta-Reinforcement Learning

Mar 03, 2018

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

Feb 23, 2018

An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients

Jan 17, 2018

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation

Aug 18, 2017