Yonggan Fu

$R^2$-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction

Apr 21, 2026

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Apr 14, 2026

Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM

Apr 08, 2026

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025

Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed

Dec 16, 2025

TiDAR: Think in Diffusion, Talk in Autoregression

Nov 12, 2025

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025

Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment

Aug 08, 2025

LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement

Apr 22, 2025