Andrey Kuzmin

GPTVQ: The Blessing of Dimensionality for LLM Quantization

Feb 23, 2024
Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough

Pruning vs Quantization: Which is Better?

Jul 06, 2023
Andrey Kuzmin, Markus Nagel, Mart van Baalen, Arash Behboodi, Tijmen Blankevoort

FP8 versus INT8 for efficient deep learning inference

Mar 31, 2023
Mart van Baalen, Andrey Kuzmin, Suparna S Nair, Yuwei Ren, Eric Mahurin, Chirag Patel, Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph Soriaga, Tijmen Blankevoort

FP8 Quantization: The Power of the Exponent

Aug 19, 2022
Andrey Kuzmin, Mart van Baalen, Yuwei Ren, Markus Nagel, Jorn Peters, Tijmen Blankevoort

Quantized Sparse Weight Decomposition for Neural Network Compression

Jul 22, 2022
Andrey Kuzmin, Mart van Baalen, Markus Nagel, Arash Behboodi

Cyclical Pruning for Sparse Neural Networks

Feb 02, 2022
Suraj Srinivas, Andrey Kuzmin, Markus Nagel, Mart van Baalen, Andrii Skliar, Tijmen Blankevoort

Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks

Dec 20, 2019
Andrey Kuzmin, Markus Nagel, Saurabh Pitre, Sandeep Pendyam, Tijmen Blankevoort, Max Welling

End-to-end Learning of Cost-Volume Aggregation for Real-time Dense Stereo

Nov 17, 2016
Andrey Kuzmin, Dmitry Mikushin, Victor Lempitsky