Dingcheng Zhen

SoulX-LiveAct: Towards Hour-Scale Real-Time Human Animation with Neighbor Forcing and ConvKV Memory

Mar 12, 2026
Via arXiv

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Oct 29, 2025
Via arXiv

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer

Aug 07, 2025
Via arXiv

Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Jun 11, 2025
Via arXiv