Video Similarity


PrismVAU: Prompt-Refined Inference System for Multimodal Video Anomaly Understanding

Jan 07, 2026
Via arXiv

Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

Jan 08, 2026
Via arXiv

PipeFlow: Pipelined Processing and Motion-Aware Frame Selection for Long-Form Video Editing

Dec 30, 2025
Via arXiv

e5-omni: Explicit Cross-modal Alignment for Omni-modal Embeddings

Jan 07, 2026
Via arXiv

FastV-RAG: Towards Fast and Fine-Grained Video QA with Retrieval-Augmented Generation

Jan 07, 2026
Via arXiv

Emergence of Human to Robot Transfer in Vision-Language-Action Models

Dec 27, 2025
Via arXiv

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Dec 29, 2025
Via arXiv

Kinematic-Based Assessment of Surgical Actions in Microanastomosis

Dec 30, 2025
Via arXiv

TV-RAG: A Temporal-aware and Semantic Entropy-Weighted Framework for Long Video Retrieval and Understanding

Dec 29, 2025
Via arXiv

Efficient and Robust Video Defense Framework against 3D-field Personalized Talking Face

Dec 24, 2025
Via arXiv