Zhaoran Wang

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization

May 30, 2023
Via arXiv

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

May 29, 2023
Via arXiv

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

May 08, 2023
Via arXiv

Dynamic Datasets and Market Environments for Financial Reinforcement Learning

Apr 25, 2023
Via arXiv

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

Mar 28, 2023
Via arXiv

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations

Mar 20, 2023
Via arXiv

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning

Feb 24, 2023
Via arXiv

Differentiable Arbitrating in Zero-sum Markov Games

Feb 20, 2023
Via arXiv

An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models

Dec 30, 2022
Via arXiv

Offline Policy Optimization in RL with Variance Regularizaton

Dec 29, 2022
Via arXiv