Amirreza Rouhi

PRISM: A Multi-View Multi-Capability Retail Video Dataset for Embodied Vision-Language Models

Mar 31, 2026
Via arXiv

ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations

Jun 10, 2025
Via arXiv

Learning Scene Context Without Images

Nov 18, 2023
Via arXiv