Takayuki Osa

Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning

Mar 03, 2026
Via arXiv

Resource-Efficient Model-Free Reinforcement Learning for Board Games

Feb 11, 2026
Via arXiv

Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning

Jun 06, 2025
Via arXiv

Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning

Jun 10, 2024
Via arXiv

Stabilizing Extreme Q-learning by Maclaurin Expansion

Jun 07, 2024
Via arXiv

Offline Reinforcement Learning from Datasets with Structured Non-Stationarity

May 23, 2024
Via arXiv

Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning

Mar 12, 2024
Via arXiv

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023
Via arXiv

Motion Planning by Learning the Solution Manifold in Trajectory Optimization

Jul 13, 2021
Via arXiv

Discovering Diverse Solutions in Deep Reinforcement Learning

Mar 12, 2021
Via arXiv