Chenghuan Huang

Beyond Few-Step Inference: Accelerating Video Diffusion Transformer Model Serving with Inter-Request Caching Reuse

Apr 06, 2026
Via arXiv

Towards Efficient Multi-Scale Deformable Attention on NPU

May 20, 2025
Via arXiv