Vid Dataset


E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching

Feb 05, 2026
Via arXiv

PointSt3R: Point Tracking through 3D Grounded Correspondence

Oct 30, 2025
Via arXiv

Detecting and Mitigating Insertion Hallucination in Video-to-Audio Generation

Oct 09, 2025
Via arXiv

CI-VID: A Coherent Interleaved Text-Video Dataset

Jul 02, 2025
Via arXiv

Leveraging Pre-Trained Visual Models for AI-Generated Video Detection

Jul 17, 2025
Via arXiv

Learning to Track Any Points from Human Motion

Jul 08, 2025
Via arXiv

RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements

Apr 11, 2025
Via arXiv

CogStream: Context-guided Streaming Video Question Answering

Jun 12, 2025
Via arXiv

TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection

Mar 18, 2025
Via arXiv

VEU-Bench: Towards Comprehensive Understanding of Video Editing

Apr 24, 2025
Via arXiv