Mohamed S. Abdelfattah

Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion

May 27, 2025

Semantic Compression of 3D Objects for Open and Collaborative Virtual Worlds

May 22, 2025

SplitReason: Learning To Offload Reasoning

Apr 23, 2025

Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models

Mar 28, 2025

xKV: Cross-Layer SVD for KV-Cache Compression

Mar 24, 2025

TokenButler: Token Importance is Predictable

Mar 10, 2025

SparAMX: Accelerating Compressed LLMs Token Generation on AMX-powered CPUs

Feb 18, 2025

The Power of Negative Zero: Datatype Customization for Quantized Large Language Models

Jan 06, 2025

NITRO: LLM Inference on Intel Laptop NPUs

Dec 15, 2024

Attamba: Attending To Multi-Token States

Nov 26, 2024