Chengzhuo Ni

Diffusion Model for Data-Driven Black-Box Optimization

Mar 20, 2024
Zihao Li, Hui Yuan, Kaixuan Huang, Chengzhuo Ni, Yinyu Ye, Minshuo Chen, Mengdi Wang

Via arXiv

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

Jul 13, 2023
Hui Yuan, Kaixuan Huang, Chengzhuo Ni, Minshuo Chen, Mengdi Wang

Via arXiv

Representation Learning for General-sum Low-rank Markov Games

Oct 30, 2022
Chengzhuo Ni, Yuda Song, Xuezhou Zhang, Chi Jin, Mengdi Wang

Via arXiv

Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization

Jun 05, 2022
Hui Yuan, Chengzhuo Ni, Huazheng Wang, Xuezhou Zhang, Le Cong, Csaba Szepesvári, Mengdi Wang

Via arXiv

Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory

Feb 10, 2022
Ruiqi Zhang, Xuezhou Zhang, Chengzhuo Ni, Mengdi Wang

Via arXiv

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration

Jan 31, 2022
Chengzhuo Ni, Ruiqi Zhang, Xiang Ji, Xuezhou Zhang, Mengdi Wang

Via arXiv

Learning Good State and Action Representations via Tensor Decomposition

May 03, 2021
Chengzhuo Ni, Anru Zhang, Yaqi Duan, Mengdi Wang

Via arXiv

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Feb 17, 2021
Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvári, Mengdi Wang

Via arXiv

Learning to Control in Metric Space with Optimal Regret

May 05, 2019
Lin F. Yang, Chengzhuo Ni, Mengdi Wang

Via arXiv