
Baifeng Shi

LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning

Jun 17, 2024
Via arXiv

When Do We Not Need Larger Vision Models?

Mar 19, 2024
Via arXiv

Humanoid Locomotion as Next Token Prediction

Feb 29, 2024
Via arXiv

Rethinking Patch Dependence for Masked Autoencoders

Jan 25, 2024
Via arXiv

Recursive Visual Programming

Dec 04, 2023
Via arXiv

LLM-grounded Video Diffusion Models

Oct 02, 2023
Via arXiv

Robot Learning with Sensorimotor Pre-training

Jun 16, 2023
Via arXiv

Refocusing Is Key to Transfer Learning

May 24, 2023
Via arXiv

Top-Down Visual Attention from Analysis by Synthesis

Mar 24, 2023
Via arXiv

Visual Attention Emerges from Recurrent Sparse Reconstruction

Apr 23, 2022
Via arXiv