Scott Niekum

Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation

Apr 20, 2025
Via arXiv

An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning

Apr 17, 2025
Via arXiv

Fast Adaptation with Behavioral Foundation Models

Apr 10, 2025
Via arXiv

Supervised Reward Inference

Feb 25, 2025
Via arXiv

Influencing Humans to Conform to Preference Models for RLHF

Jan 11, 2025
Via arXiv

RL Zero: Zero-Shot Language to Behaviors without any Supervision

Dec 07, 2024
Via arXiv

Predicting Future Actions of Reinforcement Learning Agents

Oct 29, 2024
Via arXiv

SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions

Oct 24, 2024
Via arXiv

Pareto-Optimal Learning from Preferences with Hidden Context

Jun 21, 2024
Via arXiv

A Dual Approach to Imitation Learning from Observations with Offline Datasets

Jun 13, 2024
Via arXiv