Xiaodong Cun

Seg-Agent: Test-Time Multimodal Reasoning for Training-Free Language-Guided Segmentation

May 13, 2026

Beyond Text Prompts: Visual-to-Visual Generation as A Unified Paradigm

May 12, 2026

CutClaw: Agentic Hours-Long Video Editing via Music Synchronization

Mar 31, 2026

LightCtrl: Training-free Controllable Video Relighting

Mar 28, 2026

MLLM-4D: Towards Visual-based Spatial-Temporal Intelligence

Feb 28, 2026

EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition

Dec 26, 2025

PersonaLive! Expressive Portrait Image Animation for Live Streaming

Dec 12, 2025

MV-Performer: Taming Video Diffusion Model for Faithful and Synchronized Multi-view Performer Synthesis

Oct 08, 2025

VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning

May 29, 2025

Sci-Fi: Symmetric Constraint for Frame Inbetweening

May 27, 2025