Nanye Ma

Benchmarking Visual State Tracking in Multimodal Video Understanding

Jun 02, 2026
Via arXiv

Mode Seeking meets Mean Seeking for Fast Long Video Generation

Feb 27, 2026
Via arXiv

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Jan 22, 2026
Via arXiv

Transition Matching Distillation for Fast Video Generation

Jan 14, 2026
Via arXiv

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Jan 16, 2025
Via arXiv

SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

Jan 16, 2024
Via arXiv