Csaba Szepesvári

Confident Natural Policy Gradient for Local Planning in $q_π$-realizable Constrained MDPs

Jun 26, 2024
Via arXiv

Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^π$-Realizability and Concentrability

May 27, 2024
Via arXiv

Regret Minimization via Saddle Point Optimization

Mar 15, 2024
Via arXiv

Switching the Loss Reduces the Cost in Batch Reinforcement Learning

Mar 12, 2024
Via arXiv

Ensemble sampling for linear bandits: small ensembles suffice

Nov 14, 2023
Via arXiv

Exploration via linearly perturbed loss minimisation

Nov 13, 2023
Via arXiv

Stochastic Gradient Descent for Gaussian Processes Done Right

Oct 31, 2023
Via arXiv

Online RL in Linearly $q^π$-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore

Oct 11, 2023
Via arXiv

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

Jul 25, 2023
Via arXiv

Context-lumpable stochastic bandits

Jun 22, 2023
Via arXiv