Aishwarya Agrawal

From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs

May 04, 2026

Discovering Failure Modes in Vision-Language Models using RL

Apr 06, 2026

Communicating about Space: Language-Mediated Spatial Integration Across Partial Views

Apr 01, 2026

CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics

Jun 10, 2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning

May 26, 2025

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Mar 27, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Mar 19, 2025

Assessing and Learning Alignment of Unimodal Vision and Language Models

Dec 05, 2024

VisMin: Visual Minimal-Change Understanding

Jul 23, 2024

Benchmarking Vision Language Models for Cultural Understanding

Jul 15, 2024