Kurt Keutzer

XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization

Aug 14, 2025
Via arXiv

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

Jun 24, 2025
Via arXiv

Multipole Attention for Efficient Long Context Reasoning

Jun 16, 2025
Via arXiv

R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation

Jun 09, 2025
Via arXiv

GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control

May 29, 2025
Via arXiv

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

May 24, 2025
Via arXiv

Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility

May 24, 2025
Via arXiv

Learning Adaptive Parallel Reasoning with Language Models

Apr 21, 2025
Via arXiv

FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference

Apr 19, 2025
Via arXiv

Segment Any Motion in Videos

Mar 28, 2025
Via arXiv