Tommaso Galliena

Memory-Augmented Vision-Language Agents for Persistent and Semantically Consistent Object Captioning

Mar 25, 2026
Via arXiv

Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions

Apr 11, 2025
Via arXiv