Muzammal Naseer

Multi-Granularity Language-Guided Multi-Object Tracking

Jun 07, 2024
Via arXiv

Multi-modal Generation via Cross-Modal In-Context Learning

May 28, 2024
Via arXiv

How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 08, 2024
Via arXiv

Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 06, 2024
Via arXiv

Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels

Apr 15, 2024
Via arXiv

Language Guided Domain Generalized Medical Image Segmentation

Apr 03, 2024
Via arXiv

VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding

Mar 25, 2024
Via arXiv

Composed Video Retrieval via Enriched Context and Discriminative Embeddings

Mar 25, 2024
Via arXiv

Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning

Mar 21, 2024
Via arXiv

ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes

Mar 15, 2024
Via arXiv