Tadashi Kozuno

Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model

Feb 02, 2023
Wenhao Yang, Han Wang, Tadashi Kozuno, Scott M. Jordan, Zhihua Zhang

Adapting to game trees in zero-sum imperfect information games

Dec 23, 2022
Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

Confident Approximate Policy Iteration for Efficient Local Planning in $q^π$-realizable MDPs

Oct 27, 2022
Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvári

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal

May 27, 2022
Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári

No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL

May 18, 2022
Han Wang, Archit Sakhadeo, Adam White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences

Jul 17, 2021
Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation

Jun 24, 2021
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Rémi Munos, Michal Valko

Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall

Jun 11, 2021
Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko

Identifying Co-Adaptation of Algorithmic and Implementational Innovations in Deep Reinforcement Learning: A Taxonomy and Case Study of Inference-based Algorithms

Mar 31, 2021
Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima, Yutaka Matsuo, Shixiang Shane Gu

Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning

Mar 23, 2021
Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, Shixiang Shane Gu
