Mengdi Wang

Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds

Sep 25, 2023
Zhenghao Xu, Xiang Ji, Minshuo Chen, Mengdi Wang, Tuo Zhao

Deep Reinforcement Learning for Efficient and Fair Allocation of Health Care Resources

Sep 15, 2023
Yikuan Li, Chengsheng Mao, Kaixuan Huang, Hanyin Wang, Zheng Yu, Mengdi Wang, Yuan Luo

Aligning Agent Policy with Externalities: Reward Design via Bilevel RL

Aug 03, 2023
Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Furong Huang, Mengdi Wang

Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks

Jul 26, 2023
Siyu Chen, Mengdi Wang, Zhuoran Yang

Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems

Jul 24, 2023
Xiang Ji, Huazheng Wang, Minshuo Chen, Tuo Zhao, Mengdi Wang

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

Jul 13, 2023
Hui Yuan, Kaixuan Huang, Chengzhuo Ni, Minshuo Chen, Mengdi Wang

Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight

Jul 06, 2023
Jiacheng Guo, Minshuo Chen, Huan Wang, Caiming Xiong, Mengdi Wang, Yu Bai

Scaling In-Context Demonstrations with Structured Attention

Jul 05, 2023
Tianle Cai, Kaixuan Huang, Jason D. Lee, Mengdi Wang

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

Jul 04, 2023
Kaiqi Zhang, Zixuan Zhang, Minshuo Chen, Mengdi Wang, Tuo Zhao, Yu-Xiang Wang

Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

Jun 26, 2023
Zixuan Zhang, Minshuo Chen, Mengdi Wang, Wenjing Liao, Tuo Zhao
