J. Andrew Bagnell

All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning

Mar 03, 2025

Hybrid Reinforcement Learning from Offline Observation Alone

Jun 11, 2024

Understanding Preference Fine-Tuning Through the Lens of Coverage

Jun 03, 2024

REBEL: Reinforcement Learning via Regressing Relative Rewards

Apr 25, 2024

Hybrid Inverse Reinforcement Learning

Feb 13, 2024

The Virtues of Pessimism in Inverse Reinforcement Learning

Feb 08, 2024

Inverse Reinforcement Learning without Reinforcement Learning

Mar 26, 2023

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

Mar 01, 2023

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

Oct 13, 2022

Game-Theoretic Algorithms for Conditional Moment Matching

Aug 19, 2022