Muhammad Uzair Khattak

How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 08, 2024

Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 06, 2024

Learning to Prompt with Text Only Supervision for Vision-Language Models

Jan 04, 2024

Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization

Nov 02, 2023

Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition

Jul 16, 2023

Self-regulating Prompts: Foundational Model Adaptation without Forgetting

Jul 13, 2023

Fine-tuned CLIP Models are Efficient Video Learners

Dec 06, 2022

MaPLe: Multi-modal Prompt Learning

Oct 06, 2022

Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection

Jul 07, 2022