Michael I. Jordan

Transferred Q-learning

Feb 09, 2022

Robust Estimation for Nonparametric Families via Generative Adversarial Networks

Feb 02, 2022

Reinforcement Learning with Heterogeneous Data: Estimation and Inference

Jan 31, 2022

Online Active Learning with Dynamic Marginal Gain Thresholding

Jan 25, 2022

Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems

Jan 24, 2022

Polyak-Ruppert Averaged Q-Learning is Statistically Efficient

Jan 23, 2022

Instance-Dependent Confidence and Early Stopping for Reinforcement Learning

Jan 21, 2022

Optimal variance-reduced stochastic approximation in Banach spaces

Jan 21, 2022

Last-Iterate Convergence of Saddle Point Optimizers via High-Resolution Differential Equations

Dec 27, 2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

Dec 27, 2021