Muhammad Maaz

PALO: A Polyglot Large Multimodal Model for 5B People

Mar 05, 2024
Muhammad Maaz, Hanoona Rasheed, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M. Anwer, Tim Baldwin, Michael Felsberg, Fahad S. Khan

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

Nov 22, 2023
Shehan Munasinghe, Rusiru Thushara, Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Mubarak Shah, Fahad Khan

GLaMM: Pixel Grounding Large Multimodal Model

Nov 06, 2023
Hanoona Rasheed, Muhammad Maaz, Sahal Shaji, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M. Anwer, Eric Xing, Ming-Hsuan Yang, Fahad S. Khan

On Orderings of Probability Vectors and Unsupervised Performance Estimation

Jun 16, 2023
Muhammad Maaz, Rui Qiao, Yiheng Zhou, Renxian Zhang

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

Jun 08, 2023
Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

Mar 27, 2023
Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation

Dec 08, 2022
Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

Fine-tuned CLIP Models are Efficient Video Learners

Dec 06, 2022
Hanoona Rasheed, Muhammad Uzair Khattak, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan

MaPLe: Multi-modal Prompt Learning

Oct 06, 2022
Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
