Julia Haas

Doing the right thing for the right reason: Evaluating artificial moral cognition by probing cost insensitivity

May 29, 2023
Yiran Mao, Madeline G. Reinecke, Markus Kunesch, Edgar A. Duéñez-Guzmán, Ramona Comanescu, Julia Haas, Joel Z. Leibo

Via arXiv

Melting Pot 2.0

Dec 13, 2022
John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Via arXiv

Ethical and social risks of harm from Language Models

Dec 08, 2021
Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

Via arXiv