Video Segmentation


A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects

Jun 16, 2025

StgcDiff: Spatial-Temporal Graph Condition Diffusion for Sign Language Transition Generation

Jun 16, 2025

MS4UI: A Dataset for Multi-modal Summarization of User Interface Instructional Videos

Jun 14, 2025

Touch begins where vision ends: Generalizable policies for contact-rich manipulation

Jun 16, 2025

MAMMA: Markerless & Automatic Multi-Person Motion Action Capture

Jun 16, 2025

Can Sound Replace Vision in LLaVA With Token Substitution?

Jun 12, 2025

Prompts to Summaries: Zero-Shot Language-Guided Video Summarization

Jun 12, 2025

Outside Knowledge Conversational Video (OKCV) Dataset -- Dialoguing over Videos

Jun 11, 2025

Self-supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation

Jun 13, 2025

Q-SAM2: Accurate Quantization for Segment Anything Model 2

Jun 11, 2025