Gedas Bertasius

TeDiO: Temporal Diagonal Optimization for Training-Free Coherent Video Diffusion

May 13, 2026
Via arXiv

EgoMemReason: A Memory-Driven Reasoning Benchmark for Long-Horizon Egocentric Video Understanding

May 11, 2026
Via arXiv

V2M-Zero: Zero-Pair Time-Aligned Video-to-Music Generation

Mar 11, 2026
Via arXiv

LiLo-VLA: Compositional Long-Horizon Manipulation via Linked Object-Centric Policies

Feb 25, 2026
Via arXiv

TimeBlind: A Spatio-Temporal Compositionality Benchmark for Video LLMs

Jan 30, 2026
Via arXiv

DocSLM: A Small Vision-Language Model for Long Multimodal Document Understanding

Nov 17, 2025
Via arXiv

Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning

Jul 09, 2025
Via arXiv

ExAct: A Video-Language Benchmark for Expert Action Analysis

Jun 06, 2025
Via arXiv

SiLVR: A Simple Language-based Video Reasoning Framework

May 30, 2025
Via arXiv

BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation

Mar 26, 2025
Via arXiv