Georgia Gkioxari

Out of Sight, Out of Mind? Evaluating State Evolution in Video World Models

Mar 13, 2026
Via arXiv

Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision

Feb 13, 2026
Via arXiv

Linear Mechanisms for Spatiotemporal Reasoning in Vision Language Models

Jan 18, 2026
Via arXiv

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Jan 04, 2026
Via arXiv

Same or Not? Enhancing Visual Perception in Vision-Language Models

Dec 29, 2025
Via arXiv

Feedforward 3D Editing via Text-Steerable Image-to-3D

Dec 15, 2025
Via arXiv

No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers

Dec 09, 2025
Via arXiv

Is This Tracker On? A Benchmark Protocol for Dynamic Tracking

Oct 22, 2025
Via arXiv

NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models

Jun 09, 2025
Via arXiv

Aligning Text, Images, and 3D Structure Token-by-Token

Jun 09, 2025
Via arXiv