Makarand Tapaswi

CVIT, IIIT Hyderabad

Can we Adopt Self-supervised Pretraining for Chest X-Rays?

Nov 23, 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding

Nov 17, 2022

Unsupervised Audio-Visual Lecture Segmentation

Oct 29, 2022

Grounded Video Situation Recognition

Oct 19, 2022

Instruction-driven history-aware policies for robotic manipulations

Sep 22, 2022

Learning from Unlabeled 3D Environments for Vision-and-Language Navigation

Aug 24, 2022

Learning Object Manipulation Skills from Video via Approximate Differentiable Physics

Aug 03, 2022

Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

Feb 23, 2022

Feature Generation for Long-tail Classification

Nov 10, 2021

Airbert: In-domain Pretraining for Vision-and-Language Navigation

Aug 20, 2021