Daniel J. Mankowitz

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Nash Learning from Human Feedback

Dec 06, 2023
Via arXiv

Towards practical reinforcement learning for tokamak magnetic control

Jul 21, 2023
Via arXiv

Optimizing Memory Mapping Using Deep Reinforcement Learning

May 11, 2023
Via arXiv

Controlling Commercial Cooling Systems Using Reinforcement Learning

Nov 11, 2022
Via arXiv

COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

Apr 19, 2022
Via arXiv

MuZero with Self-competition for Rate Control in VP9 Video Compression

Feb 14, 2022
Via arXiv

Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification

Oct 20, 2020
Via arXiv

Balancing Constraints and Rewards with Meta-Gradient D4PG

Oct 13, 2020
Via arXiv

An empirical investigation of the challenges of real-world reinforcement learning

Mar 24, 2020
Via arXiv