Csaba Szepesvari

Stochastic Gradient Succeeds for Bandits

Feb 27, 2024
Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvari, Dale Schuurmans

Sample Efficient Deep Reinforcement Learning via Local Planning

Jan 29, 2023
Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao, Csaba Szepesvari

The Role of Baselines in Policy Gradient Optimization

Jan 16, 2023
Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvari, Dale Schuurmans

Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making

Sep 29, 2022
Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvari, Chi Jin

Towards Painless Policy Optimization for Constrained MDPs

Apr 11, 2022
Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup

Understanding the Effect of Stochasticity in Policy Optimization

Oct 29, 2021
Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvari, Dale Schuurmans

On the Sample Complexity of Batch Reinforcement Learning with Policy-Induced Data

Jun 18, 2021
Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvari

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning

Jun 15, 2021
Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, Andras Gyorgy, Csaba Szepesvari, Raia Hadsell, Nicolas Heess, Martin Riedmiller

Leveraging Non-uniformity in First-order Non-convex Optimization

May 13, 2021
Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvari, Dale Schuurmans

On the Optimality of Batch Policy Optimization Algorithms

Apr 06, 2021
Chenjun Xiao, Yifan Wu, Tor Lattimore, Bo Dai, Jincheng Mei, Lihong Li, Csaba Szepesvari, Dale Schuurmans
