David Krueger

Safety Cases: How to Justify the Safety of Advanced AI Systems

Mar 18, 2024
Joshua Clymer, Nick Gabrieli, David Krueger, Thomas Larsen

Via arXiv

A Generative Model of Symmetry Transformations

Mar 04, 2024
James Urquhart Allingham, Bruno Kacper Mlodozeniec, Shreyas Padhy, Javier Antorán, David Krueger, Richard E. Turner, Eric Nalisnick, José Miguel Hernández-Lobato

Via arXiv

Visibility into AI Agents

Feb 04, 2024
Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

Via arXiv

Black-Box Access is Insufficient for Rigorous AI Audits

Jan 25, 2024
Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell

Via arXiv

Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models

Dec 22, 2023
Alan Chan, Ben Bucknall, Herbie Bradley, David Krueger

Via arXiv

Managing AI Risks in an Era of Rapid Progress

Oct 26, 2023
Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

Via arXiv

Meta- (out-of-context) learning in neural networks

Oct 24, 2023
Dmitrii Krasheninnikov, Egor Krasheninnikov, Bruno Mlodozeniec, David Krueger

Via arXiv

Reward Model Ensembles Help Mitigate Overoptimization

Oct 04, 2023
Thomas Coste, Usman Anwar, Robert Kirk, David Krueger

Via arXiv