Ben Weinstein-Raun

Thousands of AI Authors on the Future of AI

Jan 05, 2024
Katja Grace, Harlan Stewart, Julia Fabienne Sandkühler, Stephen Thomas, Ben Weinstein-Raun, Jan Brauner

Via arXiv

Adversarial Training for High-Stakes Reliability

May 04, 2022
Daniel M. Ziegler, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, Tao Lin, Adam Scherlis, Noa Nabeshima, Ben Weinstein-Raun, Daniel de Haas, Buck Shlegeris, Nate Thomas

Via arXiv