Edward Grefenstette

Debating with More Persuasive LLMs Leads to More Truthful Answers

Feb 15, 2024
Via arXiv

Leading the Pack: N-player Opponent Shaping

Dec 26, 2023
Via arXiv

Scaling Opponent Shaping to High Dimensional Games

Dec 19, 2023
Via arXiv

H-GAP: Humanoid Control with a Generalist Planner

Dec 05, 2023
Via arXiv

minimax: Efficient Baselines for Autocurricula in JAX

Nov 23, 2023
Via arXiv

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Nov 21, 2023
Via arXiv

Understanding the Effects of RLHF on LLM Generalisation and Diversity

Oct 10, 2023
Via arXiv

Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions

Mar 30, 2023
Via arXiv

Optimal Transport for Offline Imitation Learning

Mar 24, 2023
Via arXiv

General Intelligence Requires Rethinking Exploration

Nov 15, 2022
Via arXiv