Andrea Tupini

AsgardBench - Evaluating Visually Grounded Interactive Planning Under Minimal Feedback

Mar 16, 2026
Via arXiv

IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents

Jul 12, 2024
Via arXiv