Makarand Tapaswi

Grounded Video Situation Recognition

Oct 19, 2022
Zeeshan Khan, C. V. Jawahar, Makarand Tapaswi

Instruction-driven history-aware policies for robotic manipulations

Sep 22, 2022
Pierre-Louis Guhur, Shizhe Chen, Ricardo Garcia, Makarand Tapaswi, Ivan Laptev, Cordelia Schmid

Learning from Unlabeled 3D Environments for Vision-and-Language Navigation

Aug 24, 2022
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev

Learning Object Manipulation Skills from Video via Approximate Differentiable Physics

Aug 03, 2022
Vladimir Petrik, Mohammad Nomaan Qureshi, Josef Sivic, Makarand Tapaswi

Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

Feb 23, 2022
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev

Feature Generation for Long-tail Classification

Nov 10, 2021
Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi

Airbert: In-domain Pretraining for Vision-and-Language Navigation

Aug 20, 2021
Pierre-Louis Guhur, Makarand Tapaswi, Shizhe Chen, Ivan Laptev, Cordelia Schmid

Learning Object Manipulation Skills via Approximate State Estimation from Real Videos

Nov 13, 2020
Vladimír Petrík, Makarand Tapaswi, Ivan Laptev, Josef Sivic

Deep Multimodal Feature Encoding for Video Ordering

Apr 05, 2020
Vivek Sharma, Makarand Tapaswi, Rainer Stiefelhagen
