Salman Khan

How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 08, 2024

Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 06, 2024

Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning

Apr 23, 2024

Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels

Apr 15, 2024

Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning

Apr 11, 2024

Language Guided Domain Generalized Medical Image Segmentation

Apr 03, 2024

Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration

Apr 02, 2024

ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection

Mar 26, 2024

Efficient Video Object Segmentation via Modulated Cross-Attention Memory

Mar 26, 2024

VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding

Mar 25, 2024