Zaid Khan

Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

Apr 16, 2024
Zaid Khan, Yun Fu

Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement

Apr 06, 2024
Zaid Khan, Vijay Kumar BG, Samuel Schulter, Yun Fu, Manmohan Chandraker

Exploring Question Decomposition for Zero-Shot VQA

Oct 25, 2023
Zaid Khan, Vijay Kumar BG, Samuel Schulter, Manmohan Chandraker, Yun Fu

Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!

Jun 06, 2023
Zaid Khan, Vijay Kumar BG, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker

Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning

Mar 21, 2023
Zaid Khan, Yun Fu

Single-Stream Multi-Level Alignment for Vision-Language Pretraining

Mar 30, 2022
Zaid Khan, Vijay Kumar BG, Xiang Yu, Samuel Schulter, Manmohan Chandraker, Yun Fu

Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation

Aug 05, 2021
Zaid Khan, Yun Fu

Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation

Aug 03, 2021
Zaid Khan, Yun Fu

One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision

Feb 03, 2021
Zaid Khan, Yun Fu

Families In Wild Multimedia (FIW-MM): A Multi-Modal Database for Recognizing Kinship

Jul 28, 2020
Joseph P. Robinson, Zaid Khan, Yu Yin, Ming Shao, Yun Fu