Ivan Laptev

WILLOW, LIENS

Reconstructing and grounding narrated instructional videos in 3D

Sep 10, 2021

Airbert: In-domain Pretraining for Vision-and-Language Navigation

Aug 20, 2021

Towards unconstrained joint hand-object reconstruction from RGB videos

Aug 16, 2021

Goal-Conditioned Reinforcement Learning with Imagined Subgoals

Jul 01, 2021

XCiT: Cross-Covariance Image Transformers

Jun 18, 2021

Segmenter: Transformer for Semantic Segmentation

May 12, 2021

Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers

Mar 30, 2021

Training Vision Transformers for Image Retrieval

Feb 10, 2021

Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Dec 01, 2020

Learning Object Manipulation Skills via Approximate State Estimation from Real Videos

Nov 13, 2020