Meera Hahn

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dec 21, 2023

Photorealistic Video Generation with Diffusion Models

Dec 11, 2023

Which way is 'right'?: Uncovering limitations of Vision-and-Language Navigation model

Nov 30, 2023

Text and Click inputs for unambiguous open vocabulary instance segmentation

Nov 24, 2023

Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

Oct 10, 2022

Learning a Visually Grounded Memory Assistant

Oct 07, 2022

No RL, No Simulation: Learning to Navigate without Navigating

Oct 22, 2021

Where Are You? Localization from Embodied Dialog

Nov 16, 2020

Tripping through time: Efficient Localization of Activities in Videos

Apr 25, 2019

Action2Vec: A Crossmodal Embedding Approach to Action Learning

Jan 02, 2019