Hallucination Evaluation


Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs

Mar 26, 2026
Via arXiv

GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

Mar 25, 2026
Via arXiv

HandVQA: Diagnosing and Improving Fine-Grained Spatial Reasoning about Hands in Vision-Language Models

Mar 27, 2026
Via arXiv

Back to Basics: Revisiting ASR in the Age of Voice Agents

Mar 26, 2026
Via arXiv

A Survey of OCR Evaluation Methods and Metrics and the Invisibility of Historical Documents

Mar 26, 2026
Via arXiv

LogitScope: A Framework for Analyzing LLM Uncertainty Through Information Metrics

Mar 26, 2026
Via arXiv

Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models

Mar 26, 2026
Via arXiv

Visual Attention Drifts, but Anchors Hold: Mitigating Hallucination in Multimodal Large Language Models via Cross-Layer Visual Anchors

Mar 26, 2026
Via arXiv

Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification

Mar 25, 2026
Via arXiv

From Intent to Evidence: A Categorical Approach for Structural Evaluation of Deep Research Agents

Mar 26, 2026
Via arXiv