Zhaoran Wang, Csaba Szepesvári

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs

Oct 18, 2021
Via arXiv