Markus Nagel

GPTVQ: The Blessing of Dimensionality for LLM Quantization

Feb 23, 2024
Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough

The LLM Surgeon

Dec 28, 2023
Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Yuki M. Asano, Tijmen Blankevoort

MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device

Oct 02, 2023
Ties van Rozendaal, Tushar Singhal, Hoang Le, Guillaume Sautiere, Amir Said, Krishna Buska, Anjuman Raha, Dimitris Kalatzis, Hitarth Mehta, Frank Mayer, Liang Zhang, Markus Nagel, Auke Wiggers

Softmax Bias Correction for Quantized Generative Models

Sep 04, 2023
Nilesh Prasad Pandey, Marios Fournarakis, Chirag Patel, Markus Nagel

ResQ: Residual Quantization for Video Perception

Aug 18, 2023
Davide Abati, Haitam Ben Yahia, Markus Nagel, Amirhossein Habibian

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training

Jul 10, 2023
Jorn Peters, Marios Fournarakis, Markus Nagel, Mart van Baalen, Tijmen Blankevoort

Pruning vs Quantization: Which is Better?

Jul 06, 2023
Andrey Kuzmin, Markus Nagel, Mart van Baalen, Arash Behboodi, Tijmen Blankevoort

Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing

Jun 22, 2023
Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort

FP8 versus INT8 for efficient deep learning inference

Mar 31, 2023
Mart van Baalen, Andrey Kuzmin, Suparna S Nair, Yuwei Ren, Eric Mahurin, Chirag Patel, Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph Soriaga, Tijmen Blankevoort

A Practical Mixed Precision Algorithm for Post-Training Quantization

Feb 10, 2023
Nilesh Prasad Pandey, Markus Nagel, Mart van Baalen, Yin Huang, Chirag Patel, Tijmen Blankevoort
