Video Similarity


XEmoGPT: An Explainable Multimodal Emotion Recognition Framework with Cue-Level Perception and Reasoning

Feb 05, 2026
Via arXiv

A labeled dataset of simulated phlebotomy procedures for medical AI: polygon annotations for object detection and human-object interaction

Feb 04, 2026
Via arXiv

AGILE: Hand-Object Interaction Reconstruction from Video via Agentic Generation

Feb 04, 2026
Via arXiv

E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching

Feb 05, 2026
Via arXiv

Human-in-the-loop Adaptation in Group Activity Feature Learning for Team Sports Video Retrieval

Feb 03, 2026
Via arXiv

KTV: Keyframes and Key Tokens Selection for Efficient Training-Free Video LLMs

Feb 03, 2026
Via arXiv

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars

Feb 02, 2026
Via arXiv

Multi-Objective Optimization for Synthetic-to-Real Style Transfer

Feb 03, 2026
Via arXiv

Unifying Watermarking via Dimension-Aware Mapping

Feb 03, 2026
Via arXiv

VRGaussianAvatar: Integrating 3D Gaussian Avatars into VR

Feb 02, 2026
Via arXiv