
Archit Sharma

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024

Stream of Search: Learning to Search in Language

Apr 01, 2024

Yell At Your Robot: Improving On-the-Fly from Language Corrections

Mar 19, 2024

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Mar 19, 2024

A Critical Evaluation of AI Feedback for Aligning Large Language Models

Feb 19, 2024

RLVF: Learning from Verbal Feedback without Overgeneralization

Feb 16, 2024

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Feb 01, 2024

Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment

Nov 02, 2023

Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning

Oct 23, 2023

An Emulator for Fine-Tuning Large Language Models using Small Language Models

Oct 19, 2023