Csaba Szepesvari

Improved Regret Bound and Experience Replay in Regularized Policy Iteration

Feb 25, 2021
Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvari

Via arXiv

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Feb 17, 2021
Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvari, Mengdi Wang

Via arXiv

Meta-Thompson Sampling

Feb 11, 2021
Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari

Via arXiv

Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes

Jan 07, 2021
Dongruo Zhou, Quanquan Gu, Csaba Szepesvari

Via arXiv

Variational Policy Gradient Method for Reinforcement Learning with General Utilities

Jul 04, 2020
Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvari, Mengdi Wang

Via arXiv

PAC-Bayes Analysis Beyond the Usual Bounds

Jun 23, 2020
Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvari, John Shawe-Taylor

Via arXiv

Differentiable Meta-Learning in Contextual Bandits

Jun 09, 2020
Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvari, Craig Boutilier

Via arXiv

Model-Based Reinforcement Learning with Value-Targeted Regression

Jun 01, 2020
Alex Ayoub, Zeyu Jia, Csaba Szepesvari, Mengdi Wang, Lin F. Yang

Via arXiv

On the Global Convergence Rates of Softmax Policy Gradient Methods

May 13, 2020
Jincheng Mei, Chenjun Xiao, Csaba Szepesvari, Dale Schuurmans

Via arXiv