Picture for Dingcheng Zhen

Dingcheng Zhen

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Add code
Oct 29, 2025
Viaarxiv icon

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Add code
Aug 07, 2025
Viaarxiv icon

Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Add code
Jun 11, 2025
Viaarxiv icon