Robert Kirk

Leading the Pack: N-player Opponent Shaping

Dec 26, 2023
Alexandra Souly, Timon Willi, Akbir Khan, Robert Kirk, Chris Lu, Edward Grefenstette, Tim Rocktäschel

Generalization to New Sequential Decision Making Tasks with In-Context Learning

Dec 06, 2023
Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Nov 21, 2023
Samyak Jain, Robert Kirk, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Edward Grefenstette, Tim Rocktäschel, David Scott Krueger

Understanding the Effects of RLHF on LLM Generalisation and Diversity

Oct 10, 2023
Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu

Reward Model Ensembles Help Mitigate Overoptimization

Oct 04, 2023
Thomas Coste, Usman Anwar, Robert Kirk, David Krueger

Domain Generalization for Robust Model-Based Offline Reinforcement Learning

Nov 27, 2022
Alan Clark, Shoaib Ahmed Siddiqui, Robert Kirk, Usman Anwar, Stephen Chung, David Krueger

Graph Backup: Data Efficient Backup Exploiting Markovian Transitions

May 31, 2022
Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette

Insights From the NeurIPS 2021 NetHack Challenge

Mar 22, 2022
Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, Daejin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, Mikayel Samvelyan, Dmitry Sorokin, Maciej Sypetkowski, Michał Sypetkowski

A Survey of Generalisation in Deep Reinforcement Learning

Nov 18, 2021
Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Sep 27, 2021
Mikayel Samvelyan, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel
