Markus Nagel

Sparse High Rank Adapters
Jun 19, 2024

Low-Rank Quantization-Aware Training for LLMs
Jun 10, 2024

GPTVQ: The Blessing of Dimensionality for LLM Quantization
Feb 23, 2024

The LLM Surgeon
Dec 28, 2023

MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device
Oct 02, 2023

Softmax Bias Correction for Quantized Generative Models
Sep 04, 2023

ResQ: Residual Quantization for Video Perception
Aug 18, 2023

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jul 10, 2023

Pruning vs Quantization: Which is Better?
Jul 06, 2023

Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Jun 22, 2023