Title: Adaptive Loss-aware Quantization for Multi-bit Networks

Dec 18, 2019

Zhongnan Qu, Zimu Zhou, Yun Cheng, Lothar Thiele



Abstract: We investigate the compression of deep neural networks by quantizing their weights and activations into multiple binary bases, known as multi-bit networks (MBNs), which accelerates inference and reduces storage for deployment on low-resource mobile and embedded platforms. We propose Adaptive Loss-aware Quantization (ALQ), a new MBN quantization pipeline that achieves an average bitwidth below one bit without notable loss in inference accuracy. Unlike previous MBN quantization solutions, which train a quantizer by minimizing the error in reconstructing full-precision weights, ALQ directly minimizes the quantization-induced error on the loss function, involving neither gradient approximation nor full-precision calculations. ALQ also exploits strategies including adaptive bitwidth, smooth bitwidth reduction, and iterative trained quantization to allow a smaller network size without loss in accuracy. Experimental results on popular image datasets show that ALQ outperforms state-of-the-art compressed networks in terms of both storage and accuracy.
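
To make the idea of "multiple binary bases" concrete, here is a minimal sketch of the reconstruction-error style of multi-bit quantization that the abstract attributes to previous MBN work. It is not ALQ itself, which minimizes the error on the loss function instead. A weight vector w is approximated as a sum of alpha_i * b_i, where each b_i has entries in {-1, +1}. The greedy residual scheme and the names `greedy_multibit_quantize`, `reconstruct`, and `squared_error` are assumptions chosen for illustration and do not come from the paper.

```python
from typing import List, Tuple


def greedy_multibit_quantize(
    weights: List[float], num_bits: int
) -> Tuple[List[float], List[List[int]]]:
    """Approximate `weights` as sum_i alphas[i] * bases[i], bases[i] in {-1, +1}^n.

    Illustrative greedy scheme: each step binarizes the current residual
    and picks the scale that minimizes the squared reconstruction error
    for that basis, which is the mean absolute value of the residual.
    """
    residual = list(weights)
    alphas: List[float] = []
    bases: List[List[int]] = []
    for _ in range(num_bits):
        # Binary basis: sign of the residual (zero is mapped to +1).
        basis = [1 if r >= 0 else -1 for r in residual]
        # Least-squares optimal scale for this sign pattern.
        alpha = sum(abs(r) for r in residual) / len(residual)
        alphas.append(alpha)
        bases.append(basis)
        # Remove the part of the residual explained by this basis.
        residual = [r - alpha * b for r, b in zip(residual, basis)]
    return alphas, bases


def reconstruct(alphas: List[float], bases: List[List[int]]) -> List[float]:
    """Rebuild the approximate weights from the scales and binary bases."""
    n = len(bases[0])
    return [sum(a * b[j] for a, b in zip(alphas, bases)) for j in range(n)]


def squared_error(x: List[float], y: List[float]) -> float:
    """Sum of squared differences between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


if __name__ == "__main__":
    w = [0.9, -0.4, 0.1, -0.7]
    for bits in (1, 2, 3):
        alphas, bases = greedy_multibit_quantize(w, bits)
        err = squared_error(w, reconstruct(alphas, bases))
        print(bits, "bit(s): squared reconstruction error =", err)
```

In this sketch, each added basis can only reduce or preserve the reconstruction error, which is why a larger bitwidth yields a closer approximation at the cost of more storage. An adaptive-bitwidth method such as the one the abstract describes would instead assign different numbers of bases to different parts of the network.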
