Lewis Hammond

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024
Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

Via arXiv

Cooperation and Control in Delegation Games

Feb 24, 2024
Oliver Sourbut, Lewis Hammond, Harriet Wood

Via arXiv

Secret Collusion Among Generative AI Agents

Feb 12, 2024
Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, Christian Schroeder de Witt

Via arXiv

Visibility into AI Agents

Feb 04, 2024
Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

Via arXiv

Welfare Diplomacy: Benchmarking Language Model Cooperation

Oct 13, 2023
Gabriel Mukobi, Hannah Erlebach, Niklas Lauffer, Lewis Hammond, Alan Chan, Jesse Clifton

Via arXiv

On Imperfect Recall in Multi-Agent Influence Diagrams

Jul 11, 2023
James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge

Via arXiv

Reasoning about Causality in Games

Jan 05, 2023
Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge

Via arXiv

Lexicographic Multi-Objective Reinforcement Learning

Dec 28, 2022
Joar Skalse, Lewis Hammond, Charlie Griffin, Alessandro Abate

Via arXiv

Observational Robustness and Invariances in Reinforcement Learning via Lexicographic Objectives

Sep 30, 2022
Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate

Via arXiv