Mart van Baalen

GPTVQ: The Blessing of Dimensionality for LLM Quantization

Feb 23, 2024
Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough

The LLM Surgeon

Dec 28, 2023
Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Yuki M. Asano, Tijmen Blankevoort

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training

Jul 10, 2023
Jorn Peters, Marios Fournarakis, Markus Nagel, Mart van Baalen, Tijmen Blankevoort

Pruning vs Quantization: Which is Better?

Jul 06, 2023
Andrey Kuzmin, Markus Nagel, Mart van Baalen, Arash Behboodi, Tijmen Blankevoort

FP8 versus INT8 for efficient deep learning inference

Mar 31, 2023
Mart van Baalen, Andrey Kuzmin, Suparna S Nair, Yuwei Ren, Eric Mahurin, Chirag Patel, Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph Soriaga, Tijmen Blankevoort

A Practical Mixed Precision Algorithm for Post-Training Quantization

Feb 10, 2023
Nilesh Prasad Pandey, Markus Nagel, Mart van Baalen, Yin Huang, Chirag Patel, Tijmen Blankevoort

Quantized Sparse Weight Decomposition for Neural Network Compression

Jul 22, 2022
Andrey Kuzmin, Mart van Baalen, Markus Nagel, Arash Behboodi

Cyclical Pruning for Sparse Neural Networks

Feb 02, 2022
Suraj Srinivas, Andrey Kuzmin, Markus Nagel, Mart van Baalen, Andrii Skliar, Tijmen Blankevoort

A White Paper on Neural Network Quantization

Jun 15, 2021
Markus Nagel, Marios Fournarakis, Rana Ali Amjad, Yelysei Bondarenko, Mart van Baalen, Tijmen Blankevoort

Bayesian Bits: Unifying Quantization and Pruning

May 15, 2020
Mart van Baalen, Christos Louizos, Markus Nagel, Rana Ali Amjad, Ying Wang, Tijmen Blankevoort, Max Welling
