Parsa Madinei

Revealing the Gap in Human and VLM Scene Perception through Counterfactual Semantic Saliency

May 13, 2026
Via arXiv

IRIS: Intent Resolution via Inference-time Saccades for Open-Ended VQA in Large Vision-Language Models

Feb 18, 2026
Via arXiv

ARChef: An iOS-Based Augmented Reality Cooking Assistant Powered by Multimodal Gemini LLM

Dec 01, 2024
Via arXiv