Kushal Kafle

Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles

Sep 10, 2025

Plot'n Polish: Zero-shot Story Visualization and Disentangled Editing with Text-to-Image Diffusion Models

Sep 04, 2025

MS4UI: A Dataset for Multi-modal Summarization of User Interface Instructional Videos

Jun 14, 2025

MAGNET: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities

Jan 15, 2025

Revisiting Multi-Modal LLM Evaluation

Aug 09, 2024

They're All Doctors: Synthesizing Diverse Counterfactuals to Mitigate Associative Bias

Jun 17, 2024

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

Apr 24, 2024

FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

Apr 23, 2024

SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data

Aug 24, 2023

Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations

Jul 05, 2022