Hanoona Rasheed

PALO: A Polyglot Large Multimodal Model for 5B People

Mar 05, 2024
Muhammad Maaz, Hanoona Rasheed, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M. Anwer, Tim Baldwin, Michael Felsberg, Fahad S. Khan

Via arXiv

GLaMM: Pixel Grounding Large Multimodal Model

Nov 06, 2023
Hanoona Rasheed, Muhammad Maaz, Sahal Shaji, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M. Anwer, Eric Xing, Ming-Hsuan Yang, Fahad S. Khan

Via arXiv

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

Jun 08, 2023
Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan

Via arXiv

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

Mar 27, 2023
Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

Via arXiv

UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation

Dec 08, 2022
Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan

Via arXiv

Fine-tuned CLIP Models are Efficient Video Learners

Dec 06, 2022
Hanoona Rasheed, Muhammad Uzair Khattak, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan

Via arXiv

MaPLe: Multi-modal Prompt Learning

Oct 06, 2022
Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan

Via arXiv

Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection

Jul 07, 2022
Hanoona Rasheed, Muhammad Maaz, Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan

Via arXiv

Multi-modal Transformers Excel at Class-agnostic Object Detection

Nov 22, 2021
Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang

Via arXiv