
Mengdi Wang


Bootstrapping Statistical Inference for Off-Policy Evaluation

Feb 09, 2021
Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang


Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations

Nov 09, 2020
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan


High-Dimensional Sparse Linear Bandits

Nov 08, 2020
Botao Hao, Tor Lattimore, Mengdi Wang


Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

Nov 08, 2020
Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvári, Mengdi Wang


Online Sparse Reinforcement Learning

Nov 08, 2020
Botao Hao, Tor Lattimore, Csaba Szepesvári, Mengdi Wang


Generalized Leverage Score Sampling for Neural Networks

Sep 21, 2020
Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu


Variational Policy Gradient Method for Reinforcement Learning with General Utilities

Jul 04, 2020
Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang


Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python

Jun 27, 2020
Jason Ge, Xingguo Li, Haoming Jiang, Han Liu, Tong Zhang, Mengdi Wang, Tuo Zhao


Model-Based Reinforcement Learning with Value-Targeted Regression

Jun 01, 2020
Alex Ayoub, Zeyu Jia, Csaba Szepesvári, Mengdi Wang, Lin F. Yang


Cautious Reinforcement Learning via Distributional Risk in the Dual Domain

Feb 27, 2020
Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel
