Factual Visual Question Answering


mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation

May 29, 2025
Via arXiv

VIGNETTE: Socially Grounded Bias Evaluation for Vision-Language Models

May 28, 2025
Via arXiv

Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs

May 21, 2025
Via arXiv

Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts

May 21, 2025
Via arXiv

ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification

Apr 29, 2025
Via arXiv

See or Recall: A Sanity Check for the Role of Vision in Solving Visualization Question Answer Tasks with Multimodal LLMs

Apr 14, 2025
Via arXiv

AgMMU: A Comprehensive Agricultural Multimodal Understanding and Reasoning Benchmark

Apr 14, 2025
Via arXiv

VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge

Apr 15, 2025
Via arXiv

ChineseSimpleVQA -- "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

Feb 19, 2025
Via arXiv

VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering

Mar 09, 2025
Via arXiv