Zhaoyi Zhou

Shrinking the Variance: Shrinkage Baselines for Reinforcement Learning with Verifiable Rewards

Nov 05, 2025
Via arXiv

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning

Oct 30, 2023
Via arXiv

Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games

Mar 08, 2023
Via arXiv