Video Alignment


Video alignment is the process of synchronizing multiple video sequences, typically by matching corresponding frames or events across them, to create a coherent timeline or narrative.
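As a rough illustration of what synchronizing two sequences can mean in practice, the sketch below aligns two clips with classic dynamic time warping (DTW) over per-frame feature vectors. DTW is only one common technique, chosen here as an assumption; it is not taken from any of the papers listed on this page. The function name `dtw_align` and the toy one-dimensional "features" are hypothetical, and real systems would use learned frame embeddings instead.

```python
import math


def dtw_align(a, b):
    """Align two sequences of per-frame feature vectors with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) pairs
    meaning frame i of `a` is matched to frame j of `b`.
    Hypothetical helper for illustration only.
    """
    n, m = len(a), len(b)

    def dist(u, v):
        # Euclidean distance between two feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match both frames
                                 cost[i - 1][j],      # b's frame repeats
                                 cost[i][j - 1])      # a's frame repeats

    # Backtrack from the end to recover the frame correspondence.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Toy example: clip_b shows the same content as clip_a,
# but its first frame is duplicated (it starts one frame "late").
clip_a = [[0.0], [1.0], [2.0], [3.0]]
clip_b = [[0.0], [0.0], [1.0], [2.0], [3.0]]
total_cost, path = dtw_align(clip_a, clip_b)
print(total_cost)  # 0.0
print(path)        # [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4)]
```

The returned path maps each frame of one clip onto the other's timeline; here frame 0 of `clip_a` is matched to both of the duplicated opening frames of `clip_b`, after which the clips advance in lockstep.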

BridgeV2W: Bridging Video Generation Models to Embodied World Models via Embodiment Masks

Feb 03, 2026, via arXiv

Conditional Flow Matching for Visually-Guided Acoustic Highlighting

Feb 03, 2026, via arXiv

Tiled Prompts: Overcoming Prompt Underspecification in Image and Video Super-Resolution

Feb 03, 2026, via arXiv

Omni-Judge: Can Omni-LLMs Serve as Human-Aligned Judges for Text-Conditioned Audio-Video Generation?

Feb 02, 2026, via arXiv

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars

Feb 02, 2026, via arXiv

Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation

Feb 03, 2026, via arXiv

From Frames to Sequences: Temporally Consistent Human-Centric Dense Prediction

Feb 03, 2026, via arXiv

PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards

Feb 02, 2026, via arXiv

InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation

Feb 03, 2026, via arXiv

3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation

Feb 03, 2026, via arXiv