Jiayue Pu

HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios

Mar 12, 2026

MIRAGE-Bench: LLM Agent is Hallucinating and Where to Find Them

Jul 28, 2025