Efficient Activation Quantization via Adaptive Rounding Border for Post-Training Quantization

Aug 25, 2022
Zhengyi Li, Cong Guo, Zhanda Zhu, Yangjie Zhou, Yuxian Qiu, Xiaotian Gao, Jingwen Leng, Minyi Guo


Post-training quantization (PTQ) attracts increasing attention due to its convenience in deploying quantized neural networks. Rounding, the primary source of quantization error, has so far been optimized only for model weights, while activations still use the rounding-to-nearest operation. In this work, for the first time, we demonstrate that well-chosen rounding schemes for activations can improve the final accuracy. Because activations change with every input, their rounding scheme cannot be fixed in advance; to handle this, we adaptively adjust the rounding border through a simple function that generates rounding schemes at the inference stage. The border function accounts for the impact of weight errors, activation errors, and propagated errors in order to eliminate the bias of the element-wise error, which further benefits model accuracy. We also make the border aware of global errors so that it better fits different incoming activations. Finally, we propose the AQuant framework to learn the border function. Extensive experiments show that AQuant achieves noticeable improvements with negligible overhead compared with state-of-the-art works and pushes the accuracy of ResNet-18 up to 60.3% under 2-bit weight and activation post-training quantization.
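To make the idea of a rounding border concrete, the following is a minimal sketch, not the AQuant implementation. It contrasts rounding-to-nearest, where the border is fixed at 0.5, with a quantizer whose border is computed per activation. The function names, the unsigned uniform grid, and the linear form of `example_border` with its coefficients `b0` and `b1` are all assumptions made for illustration; the abstract only states that the border comes from a simple learned function.

```python
import math


def quantize_nearest(x, scale, n_bits=2):
    """Unsigned uniform quantization with rounding-to-nearest (border fixed at 0.5)."""
    q_max = 2 ** n_bits - 1
    q = math.floor(x / scale + 0.5)
    return min(max(q, 0), q_max) * scale


def quantize_with_border(x, scale, border_fn, n_bits=2):
    """Same grid, but round up only when the fractional part reaches border_fn(x)."""
    q_max = 2 ** n_bits - 1
    v = x / scale
    low = math.floor(v)
    frac = v - low                                  # fractional part in [0, 1)
    border = min(max(border_fn(x), 0.0), 1.0)       # keep the border inside [0, 1]
    q = low + (1 if frac >= border else 0)
    return min(max(q, 0), q_max) * scale


def example_border(x, b0=0.4, b1=0.05):
    """Hypothetical border function: linear in the activation value.

    b0 and b1 stand in for parameters that would be learned; the values
    here are arbitrary and not taken from the paper.
    """
    return b0 + b1 * x


if __name__ == "__main__":
    for x in (0.30, 0.45, 1.20, 2.60):
        nearest = quantize_nearest(x, scale=1.0)
        adaptive = quantize_with_border(x, scale=1.0, border_fn=example_border)
        print(f"x={x:.2f}  nearest={nearest}  adaptive={adaptive}")
```

With a constant border of 0.5, `quantize_with_border` reduces to rounding-to-nearest, so the border function is the only thing that changes. Because the border is evaluated from the incoming activation, the rounding decision is made at inference time rather than fixed offline, which is the property the abstract emphasizes.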
