Mubarak Shah

VidLA: Video-Language Alignment at Scale

Mar 21, 2024

FSViewFusion: Few-Shots View Generation of Novel Objects

Mar 13, 2024

CodaMal: Contrastive Domain Adaptation for Malaria Detection in Low-Cost Microscopes

Feb 16, 2024

No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

Dec 20, 2023

DVANet: Disentangling View and Action Features for Multi-View Action Recognition

Dec 10, 2023

Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

Dec 07, 2023

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

Nov 22, 2023

Videoprompter: an ensemble of foundational models for zero-shot video understanding

Oct 23, 2023

GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization

Sep 27, 2023

Egocentric RGB+Depth Action Recognition in Industry-Like Settings

Sep 25, 2023