Quanquan Gu

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes

Dec 12, 2022

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Dec 12, 2022

A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

Sep 30, 2022

Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium

Aug 10, 2022

Towards Understanding Mixture of Experts in Deep Learning

Aug 04, 2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

Aug 03, 2022

A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits

Jul 07, 2022

Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs

May 23, 2022

Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions

May 13, 2022

On the Convergence of Certified Robust Training with Interval Bound Propagation

Mar 16, 2022