Quantization of Deep Neural Networks for Accurate Edge Computing

Apr 25, 2021
Wentao Chen, Hailong Qiu, Jian Zhuang, Chutong Zhang, Yu Hu, Qing Lu, Tianchen Wang, Yiyu Shi†, Meiping Huang, Xiaowei Xu

Figures 1-4 for Quantization of Deep Neural Networks for Accurate Edge Computing

Deep neural networks (DNNs) have demonstrated their great potential in recent years, exceeding the performance of human experts in a wide range of applications. Due to their large sizes, however, compression techniques such as weight quantization and pruning are usually applied before they can be accommodated on the edge. It is generally believed that quantization leads to performance degradation, and plenty of existing works have explored quantization strategies aiming at minimum accuracy loss. In this paper, we argue that quantization, which essentially imposes regularization on weight representations, can sometimes help to improve accuracy. We conduct comprehensive experiments on three widely used applications: fully connected network (FCN) for biomedical image segmentation, convolutional neural network (CNN) for image classification on ImageNet, and recurrent neural network (RNN) for automatic speech recognition, and experimental results show that quantization can improve the accuracy by 1%, 1.95%, 4.23% on the three applications respectively with 3.5x-6.4x memory reduction.
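To make the idea of weight quantization concrete, the sketch below shows a generic symmetric uniform quantizer applied to a weight tensor. This is an illustrative example only, not the specific scheme used in the paper: the function name, bit width, and per-tensor scaling are assumptions for the sake of demonstration. Quantize-then-dequantize rounds each weight to one of a small number of levels, which is the coarser (regularized) representation the abstract refers to.

```python
import numpy as np

def quantize_weights(w, num_bits=4):
    """Symmetric uniform quantization of a weight tensor (illustrative).

    Maps weights to signed integer codes with num_bits bits using a
    single per-tensor scale, then dequantizes back to floats so the
    result can be used as a drop-in, coarser version of the weights.
    """
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 7 for 4 bits
    scale = np.max(np.abs(w)) / qmax            # per-tensor scale factor
    codes = np.clip(np.round(w / scale), -qmax - 1, qmax)  # integer codes
    return codes * scale                        # dequantized weights

# Example: quantize a random 4x4 weight matrix to 4 bits.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
w_q = quantize_weights(w, num_bits=4)
```

Storing the integer codes plus one float scale instead of 32-bit floats is where the memory reduction comes from (roughly 32/num_bits, before metadata), while the rounding acts as the regularization on weight representations discussed above.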

* 11 pages, 3 figures, 10 tables, accepted by the ACM Journal on Emerging Technologies in Computing Systems (JETC) 