Title: Layer-wise Quantization for Quantized Optimistic Dual Averaging

May 20, 2025

Anh Duc Nguyen, Ilia Markov, Frank Zhengqing Wu, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan Alistarh, Volkan Cevher



Abstract: Modern deep neural networks exhibit heterogeneity across their many layers of various types (residual, multi-head attention, etc.), owing to differing structures (dimensions, activation functions, etc.) and distinct representation characteristics, all of which impact predictions. We develop a general layer-wise quantization framework with tight variance and code-length bounds that adapts to these heterogeneities over the course of training. We then apply a new layer-wise quantization technique within distributed variational inequalities (VIs), proposing a novel Quantized Optimistic Dual Averaging (QODA) algorithm with adaptive learning rates, which achieves competitive convergence rates for monotone VIs. We empirically show that QODA achieves up to a $150\%$ speedup over the baselines in end-to-end training time for training Wasserstein GAN on $12+$ GPUs.

* Accepted at the International Conference on Machine Learning (ICML 2025)
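
To make the idea of layer-wise quantization concrete, here is a minimal sketch, not the paper's scheme. It assumes a generic unbiased stochastic quantizer with uniformly spaced levels and gives each layer its own number of levels. The function names (`quantize_layer`, `quantize_layerwise`), the layer names, and the level counts are all hypothetical, chosen only for illustration. The adaptive level selection and the variance and code-length bounds described in the abstract are not reproduced here.

```python
import math
import random


def quantize_layer(values, num_levels, rng):
    """Stochastically round one layer's entries onto a uniform grid.

    Assumption: the grid has `num_levels` equal steps on [0, ||values||_2],
    and rounding is randomized so that the expected output equals the input.
    """
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        return [0.0] * len(values)
    quantized = []
    for v in values:
        scaled = abs(v) / norm * num_levels  # position on the grid, in [0, num_levels]
        lower = math.floor(scaled)
        # Round up with probability equal to the fractional part (unbiased).
        level = lower + (1 if rng.random() < scaled - lower else 0)
        quantized.append(math.copysign(level / num_levels * norm, v))
    return quantized


def quantize_layerwise(layers, levels_per_layer, seed=0):
    """Quantize each named layer with its own (hypothetical) level count."""
    rng = random.Random(seed)
    return {
        name: quantize_layer(values, levels_per_layer[name], rng)
        for name, values in layers.items()
    }


# Hypothetical example: a coarser grid for one layer, a finer grid for another.
layers = {"attention": [0.3, -0.4, 0.0], "residual": [1.0, 2.0, -2.0]}
levels = {"attention": 4, "residual": 16}
print(quantize_layerwise(layers, levels, seed=1))
```

Giving each layer its own level count is what separates this from a global quantizer: layers whose values are more sensitive can be given a finer grid, at the cost of a longer code for those layers.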
