Serena Booth

Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners

Mar 08, 2025
Via arXiv

Influencing Humans to Conform to Preference Models for RLHF

Jan 11, 2025

Quality-Diversity Generative Sampling for Learning with Synthetic Data

Dec 22, 2023

Learning Optimal Advantage from Preferences and Mistaking it for Reward

Oct 03, 2023

Models of human preference for learning reward functions

Jun 05, 2022

The Irrationality of Neural Rationale Models

Oct 14, 2021

Machine Learning Practices Outside Big Tech: How Resource Constraints Challenge Responsible Development

Oct 06, 2021

Do Feature Attribution Methods Correctly Attribute Features?

Apr 27, 2021

RoCUS: Robot Controller Understanding via Sampling

Dec 25, 2020

Bayes-Probe: Distribution-Guided Sampling for Prediction Level Sets

Feb 19, 2020