Katerina Fragkiadaki

TAPIP3D: Tracking Any Point in Persistent 3D Geometry

Apr 20, 2025

Unified Multimodal Discrete Diffusion

Mar 26, 2025

Unifying 2D and 3D Vision-Language Understanding

Mar 13, 2025

Video Depth without Video Models

Nov 28, 2024

Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations

Aug 08, 2024

Video Diffusion Alignment via Reward Gradients

Jul 11, 2024

ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights

Jun 20, 2024

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

May 03, 2024

HELPER-X: A Unified Instructable Embodied Agent to Tackle Four Interactive Vision-Language Domains with Memory-Augmented Language Models

Apr 29, 2024

Tractable Joint Prediction and Planning over Discrete Behavior Modes for Urban Driving

Mar 12, 2024