Alexander Conzelmann

Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding

May 24, 2025

Alexander Conzelmann, Robert Bamler

Abstract: The ever-growing size of neural networks poses serious challenges on resource-constrained devices, such as embedded sensors. Compression algorithms that reduce their size can mitigate these problems, provided that model performance stays close to the original. We propose a novel post-training compression framework that combines rate-aware quantization with entropy coding by (1) extending the well-known layer-wise loss by a quadratic rate estimation, and (2) providing locally exact solutions to this modified objective following the Optimal Brain Surgeon (OBS) method. Our method allows for very fast decoding and is compatible with arbitrary quantization grids. We verify our results empirically by testing on various computer-vision networks, achieving a 20-40% decrease in bit rate at the same performance as the popular compression algorithm NNCodec. Our code is available at https://github.com/Conzel/cerwu.

* 9 pages + 5 pages of appendix
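
To make the idea of rate-aware quantization concrete, here is a minimal sketch, not the authors' algorithm. It omits the layer-wise loss and the OBS-style updates described in the abstract and instead treats each weight independently, assigning it to the grid point that minimizes squared error plus a penalty proportional to an estimated code length. The function names, the add-one smoothing, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np


def rate_aware_quantize(weights, grid, lam, n_iters=5):
    """Toy scalar quantizer: minimize (w - q)^2 + lam * estimated_bits(q).

    Illustrative assumption only; this is not the method from the paper.
    """
    weights = np.asarray(weights, dtype=np.float64)
    grid = np.asarray(grid, dtype=np.float64)
    # Start from plain nearest-neighbor rounding onto the grid.
    idx = np.abs(weights[:, None] - grid[None, :]).argmin(axis=1)
    for _ in range(n_iters):
        # Estimate each grid point's code length from its current usage.
        # Add-one smoothing keeps the cost of unused points finite.
        counts = np.bincount(idx, minlength=grid.size) + 1.0
        bits = -np.log2(counts / counts.sum())
        # Trade distortion against estimated rate, weighted by lam.
        cost = (weights[:, None] - grid[None, :]) ** 2 + lam * bits[None, :]
        idx = cost.argmin(axis=1)
    return grid[idx], idx


def empirical_entropy_bits(idx, n_symbols):
    """Empirical entropy (bits per weight) of the chosen grid indices."""
    counts = np.bincount(idx, minlength=n_symbols).astype(np.float64)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())
```

With `lam = 0` the sketch reduces to ordinary rounding to the nearest grid point. Larger values of `lam` pull weights toward frequently used grid points, which lowers the entropy an ideal entropy coder would have to spend, at the cost of higher quantization error.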

Decentralized Task Offloading and Load-Balancing for Mobile Edge Computing in Dense Networks

Jun 24, 2024

Mariam Yahya, Alexander Conzelmann, Setareh Maghsudi

Abstract: We study the problem of decentralized task offloading and load-balancing in a dense network with numerous devices and a set of edge servers. Solving this problem optimally is complicated due to the unknown network information and random task sizes. The shared network resources also influence the users' decisions and resource distribution. Our solution combines the mean field multi-agent multi-armed bandit (MAB) game with a load-balancing technique that adjusts the servers' rewards to achieve a target population profile despite the distributed user decision-making. Numerical results demonstrate the efficacy of our approach and the convergence to the target load distribution.
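
The interplay between distributed bandit learners and server-side reward adjustment can be illustrated with a toy simulation. This is a sketch under heavy assumptions and not the algorithm from the paper: every user runs an independent epsilon-greedy bandit, the congestion model (reward falls linearly with a server's share of users), the additive per-server bonus, and all parameter values are hypothetical. Whether the final load settles near the target depends on these choices.

```python
import numpy as np


def simulate_load_balancing(n_users=200, target=(0.5, 0.3, 0.2),
                            n_rounds=300, eps=0.1, alpha=0.2,
                            step=0.5, seed=0):
    """Toy model: epsilon-greedy users plus a server-side reward bonus.

    Illustrative assumption only; this is not the method from the paper.
    Returns the fraction of users on each server in the final round.
    """
    rng = np.random.default_rng(seed)
    target = np.asarray(target, dtype=np.float64)
    n_servers = target.size
    users = np.arange(n_users)
    q = np.zeros((n_users, n_servers))  # each user's value estimates
    bonus = np.zeros(n_servers)         # reward adjustment per server
    load = np.full(n_servers, 1.0 / n_servers)
    for _ in range(n_rounds):
        # Each user decides on its own: explore at random or exploit.
        greedy = q.argmax(axis=1)
        explore = rng.random(n_users) < eps
        random_choice = rng.integers(n_servers, size=n_users)
        choice = np.where(explore, random_choice, greedy)
        # Users interact only through the aggregate load (mean field).
        load = np.bincount(choice, minlength=n_servers) / n_users
        noise = 0.05 * rng.standard_normal(n_users)
        reward = 1.0 - load[choice] + bonus[choice] + noise
        # Constant-step update, since rewards drift as the load changes.
        q[users, choice] += alpha * (reward - q[users, choice])
        # Raise rewards on under-loaded servers, lower them on over-loaded.
        bonus += step * (target - load)
    return load
```

Calling `simulate_load_balancing()` returns the final load vector, which can be compared against `target` to see how the bonus term steers the population in this toy setting.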
