Scott Niekum

Pareto-Optimal Learning from Preferences with Hidden Context

Jun 21, 2024

A Dual Approach to Imitation Learning from Observations with Offline Datasets

Jun 13, 2024

Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning

May 06, 2024

D2PO: Discriminator-Guided DPO with Response Evaluation Models

May 02, 2024

Automated Discovery of Functional Actual Causes in Complex Environments

Apr 16, 2024

Learning Action-based Representations Using Invariance

Mar 25, 2024

Score Models for Offline Goal-Conditioned Reinforcement Learning

Nov 03, 2023

Contrastive Preference Learning: Learning from Human Feedback without RL

Oct 24, 2023

Learning Optimal Advantage from Preferences and Mistaking it for Reward

Oct 03, 2023

Hierarchical Empowerment: Towards Tractable Empowerment-Based Skill-Learning

Jul 06, 2023