Joar Skalse

Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification

Mar 11, 2024
Joar Skalse, Alessandro Abate

Via arXiv

On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks

Jan 26, 2024
Joar Skalse, Alessandro Abate

Via arXiv

On The Expressivity of Objective-Specification Formalisms in Reinforcement Learning

Oct 18, 2023
Rohan Subramani, Marcus Williams, Max Heitmann, Halfdan Holm, Charlie Griffin, Joar Skalse

Via arXiv

Goodhart's Law in Reinforcement Learning

Oct 13, 2023
Jacek Karwowski, Oliver Hayman, Xingjian Bai, Klaus Kiendlhofer, Charlie Griffin, Joar Skalse

Via arXiv

STARC: A General Framework For Quantifying Differences Between Reward Functions

Sep 26, 2023
Joar Skalse, Lucy Farnik, Sumeet Ramesh Motwani, Erik Jenner, Adam Gleave, Alessandro Abate

Via arXiv

Lexicographic Multi-Objective Reinforcement Learning

Dec 28, 2022
Joar Skalse, Lewis Hammond, Charlie Griffin, Alessandro Abate

Via arXiv

Misspecification in Inverse Reinforcement Learning

Dec 06, 2022
Joar Skalse, Alessandro Abate

Via arXiv

Defining and Characterizing Reward Hacking

Sep 27, 2022
Joar Skalse, Nikolaus H. R. Howe, Dmitrii Krasheninnikov, David Krueger

Via arXiv

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

Mar 14, 2022
Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave

Via arXiv

A General Counterexample to Any Decision Theory and Some Responses

Jan 01, 2021
Joar Skalse

Via arXiv