Joongwon Chae

Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents

Dec 03, 2024
Via arXiv

SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection

Dec 03, 2024
Via arXiv