Josef Sivic

GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos

Dec 12, 2023

Customizing Motion in Text-to-Video Diffusion Models

Dec 07, 2023

Visually Guided Model Predictive Robot Control via 6D Object Pose Localization and Tracking

Nov 09, 2023

Learning to design protein-protein interactions with enhanced generalization

Oct 27, 2023

VidChapters-7M: Video Chapters at Scale

Sep 25, 2023

Meta-Personalizing Vision-Language Models to Find Named Instances in Video

Jun 16, 2023

Language-Guided Music Recommendation for Video via Prompt Analogies

Jun 15, 2023

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Mar 21, 2023

MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare

Dec 13, 2022

Multi-Task Learning of Object State Changes from Uncurated Videos

Nov 24, 2022