Anikait Singh

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023

Robotic Offline RL from Internet Videos via Value-Function Pre-Training

Sep 22, 2023

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Jul 28, 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

Mar 09, 2023

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

Nov 21, 2022

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

Oct 11, 2022

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?

Apr 12, 2022

A Workflow for Offline Model-Free Robotic Reinforcement Learning

Sep 23, 2021