Edward Grefenstette

Debating with More Persuasive LLMs Leads to More Truthful Answers

Feb 15, 2024
Akbir Khan, John Hughes, Dan Valentine, Laura Ruis, Kshitij Sachan, Ansh Radhakrishnan, Edward Grefenstette, Samuel R. Bowman, Tim Rocktäschel, Ethan Perez

Via arXiv

Leading the Pack: N-player Opponent Shaping

Dec 26, 2023
Alexandra Souly, Timon Willi, Akbir Khan, Robert Kirk, Chris Lu, Edward Grefenstette, Tim Rocktäschel

Via arXiv

Scaling Opponent Shaping to High Dimensional Games

Dec 19, 2023
Akbir Khan, Timon Willi, Newton Kwan, Andrea Tacchetti, Chris Lu, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

Via arXiv

H-GAP: Humanoid Control with a Generalist Planner

Dec 05, 2023
Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian

Via arXiv

minimax: Efficient Baselines for Autocurricula in JAX

Nov 23, 2023
Minqi Jiang, Michael Dennis, Edward Grefenstette, Tim Rocktäschel

Via arXiv

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Nov 21, 2023
Samyak Jain, Robert Kirk, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Edward Grefenstette, Tim Rocktäschel, David Scott Krueger

Via arXiv

Understanding the Effects of RLHF on LLM Generalisation and Diversity

Oct 10, 2023
Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu

Via arXiv

Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions

Mar 30, 2023
Yicheng Luo, Jackie Kay, Edward Grefenstette, Marc Peter Deisenroth

Via arXiv