Pierluca D'Oro

Do Transformer World Models Give Better Policy Gradients?

Feb 11, 2024
Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon

Motif: Intrinsic Motivation from Artificial Intelligence Feedback

Sep 29, 2023
Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control

Sep 26, 2023
Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare

The Primacy Bias in Deep Reinforcement Learning

May 16, 2022
Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville

Policy Optimization as Online Learning with Mediator Feedback

Dec 15, 2020
Alberto Maria Metelli, Matteo Papini, Pierluca D'Oro, Marcello Restelli

How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization

Apr 29, 2020
Pierluca D'Oro, Wojciech Jaśkowski

Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs

Apr 07, 2020
Giorgio Giannone, Asha Anoosheh, Alessio Quaglino, Pierluca D'Oro, Marco Gallieri, Jonathan Masci

Gradient-Aware Model-based Policy Search

Sep 09, 2019
Pierluca D'Oro, Alberto Maria Metelli, Andrea Tirinzoni, Matteo Papini, Marcello Restelli
