Dmitrii Krasheninnikov

Meta- (out-of-context) learning in neural networks

Oct 24, 2023
Dmitrii Krasheninnikov, Egor Krasheninnikov, Bruno Mlodozeniec, David Krueger

Via arXiv

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Jul 27, 2023
Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Bıyık, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

Via arXiv

Defining and Characterizing Reward Hacking

Sep 27, 2022
Joar Skalse, Nikolaus H. R. Howe, Dmitrii Krasheninnikov, David Krueger

Via arXiv

Combining Reward Information from Multiple Sources

Mar 22, 2021
Dmitrii Krasheninnikov, Rohin Shah, Herke van Hoof

Via arXiv

Preferences Implicit in the State of the World

Feb 12, 2019
Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan

Via arXiv