Dylan Hadfield-Menell

Recommending to Strategic Users

Feb 13, 2023
Andreas Haupt, Dylan Hadfield-Menell, Chara Podimata

Via arXiv

Benchmarking Interpretability Tools for Deep Neural Networks

Feb 08, 2023
Stephen Casper, Yuxiao Li, Jiawei Li, Tong Bu, Kevin Zhang, Dylan Hadfield-Menell

Via arXiv

Diagnostics for Deep Neural Networks with Automated Copy/Paste Attacks

Nov 22, 2022
Stephen Casper, Kaivalya Hariharan, Dylan Hadfield-Menell

Via arXiv

White-Box Adversarial Policies in Deep Reinforcement Learning

Sep 05, 2022
Stephen Casper, Dylan Hadfield-Menell, Gabriel Kreiman

Via arXiv

Get It in Writing: Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL

Aug 22, 2022
Phillip J. K. Christoffersen, Andreas A. Haupt, Dylan Hadfield-Menell

Via arXiv

Towards Psychologically-Grounded Dynamic Preference Models

Aug 06, 2022
Mihaela Curmei, Andreas Haupt, Dylan Hadfield-Menell, Benjamin Recht

Via arXiv

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks

Jul 28, 2022
Tilman Räuker, Anson Ho, Stephen Casper, Dylan Hadfield-Menell

Via arXiv

Building Human Values into Recommender Systems: An Interdisciplinary Synthesis

Jul 20, 2022
Jonathan Stray, Alon Halevy, Parisa Assar, Dylan Hadfield-Menell, Craig Boutilier, Amar Ashar, Lex Beattie, Michael Ekstrand, Claire Leibowicz, Connie Moon Sehat, Sara Johansen, Lianne Kerlin, David Vickrey, Spandana Singh, Sanne Vrijenhoek, Amy Zhang, McKane Andrus, Natali Helberger, Polina Proutskova, Tanushree Mitra, Nina Vasan

Via arXiv

How to talk so your robot will learn: Instructions, descriptions, and pragmatics

Jun 16, 2022
Theodore R Sumers, Robert D Hawkins, Mark K Ho, Thomas L Griffiths, Dylan Hadfield-Menell

Via arXiv