Pablo Samuel Castro

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Mar 06, 2024
Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal

In deep reinforcement learning, a pruned network is a good network

Feb 19, 2024
Johan Obando-Ceron, Aaron Courville, Pablo Samuel Castro

Mixtures of Experts Unlock Parameter Scaling for Deep RL

Feb 13, 2024
Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro

A density estimation perspective on learning from pairwise human preferences

Nov 30, 2023
Vincent Dumoulin, Daniel D. Johnson, Pablo Samuel Castro, Hugo Larochelle, Yann Dauphin

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

Nov 21, 2023
Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

Small batch deep reinforcement learning

Oct 05, 2023
Johan Obando-Ceron, Marc G. Bellemare, Pablo Samuel Castro

Offline Reinforcement Learning with On-Policy Q-Function Regularization

Jul 25, 2023
Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist

Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Jun 24, 2023
Maxime Chevalier-Boisvert, Bolun Dai, Mark Towers, Rodrigo de Lazcano, Lucas Willems, Salem Lahlou, Suman Pal, Pablo Samuel Castro, Jordan Terry

Bigger, Better, Faster: Human-level Atari with human-level efficiency

Jun 09, 2023
Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
