Zhiling Ye

Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term

May 25, 2023
Yun Yue, Jiadi Jiang, Zhiling Ye, Ning Gao, Yongchao Liu, Ke Zhang


The generalization of Deep Neural Networks (DNNs) is known to be closely related to the flatness of minima, leading to the development of Sharpness-Aware Minimization (SAM) for seeking flatter minima and better generalization. In this paper, we revisit the loss of SAM and propose a more general method, called WSAM, by incorporating sharpness as a regularization term. We prove its generalization bound through a combination of PAC and Bayes-PAC techniques, and evaluate its performance on various public datasets. The results demonstrate that WSAM achieves improved generalization, or is at least highly competitive, compared to the vanilla optimizer, SAM and its variants. The code is available at https://github.com/intelligent-machine-learning/dlrover/tree/master/atorch/atorch/optimizers.

* 10 pages. Accepted as a conference paper at KDD '23 
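To make the "sharpness as a regularization term" idea concrete, here is a minimal PyTorch-style sketch of one way to read it: minimize L(w) + gamma * [L(w + eps*) - L(w)], where eps* is the usual SAM ascent direction. The function name wsam_like_step, the coefficient gamma, the radius rho, and the two-pass gradient combination below are illustrative assumptions, not the paper's exact formulation or the official implementation (which lives in the linked dlrover repository).

```python
import torch

def wsam_like_step(model, loss_fn, data, target, base_opt, rho=0.05, gamma=0.9):
    """Illustrative WSAM-like step (a sketch, not the official WSAM optimizer).

    Minimizes L(w) + gamma * sharpness(w), with sharpness(w) ~= L(w + eps*) - L(w),
    which is equivalent to the weighted objective (1 - gamma) * L(w) + gamma * L(w + eps*).
    """
    base_opt.zero_grad()

    # 1) Loss and gradient at the current weights w.
    loss = loss_fn(model(data), target)
    loss.backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grads = [p.grad.detach().clone() for p in params]

    # 2) SAM-style ascent step: w -> w + eps*, with eps* = rho * g / ||g||.
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).item() + 1e-12
    scale = rho / grad_norm
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(g, alpha=scale)

    # 3) Gradient of the perturbed loss L(w + eps*).
    model.zero_grad()
    loss_fn(model(data), target).backward()

    # 4) Restore w and form the weighted gradient:
    #    (1 - gamma) * grad L(w) + gamma * grad L(w + eps*).
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(g, alpha=scale)                         # back to the original w
            p.grad.mul_(gamma).add_(g, alpha=1.0 - gamma)  # weighted combination

    base_opt.step()
    return loss.item()
```

With gamma = 1 this reduces to a plain SAM step, and with gamma = 0 to the vanilla base optimizer, which is the sense in which a weighted sharpness term generalizes both.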

MFAGAN: A Compression Framework for Memory-Efficient On-Device Super-Resolution GAN

Jul 27, 2021
Wenlong Cheng, Mingbo Zhao, Zhiling Ye, Shuhang Gu


Generative adversarial networks (GANs) have driven remarkable advances in single-image super-resolution (SR) by recovering photo-realistic images. However, the high memory consumption of GAN-based SR models (usually the generators) causes performance degradation and higher energy consumption, hindering the deployment of GAN-based SR on resource-constrained mobile devices. In this paper, we propose a novel compression framework, Multi-scale Feature Aggregation Net based GAN (MFAGAN), for reducing the memory access cost of the generator. First, to overcome the memory explosion of dense connections, we utilize a memory-efficient multi-scale feature aggregation net as the generator. Second, for faster and more stable training, our method introduces the PatchGAN discriminator. Third, to balance the student discriminator and the compressed generator, we distill both the generator and the discriminator. Finally, we perform a hardware-aware neural architecture search (NAS) to find a specialized SubGenerator for the target mobile phone. Benefiting from these improvements, the proposed MFAGAN achieves up to 8.3× memory saving and 42.9× computation reduction, with only minor visual quality degradation, compared with ESRGAN. Empirical studies also show ~70 milliseconds latency on the Qualcomm Snapdragon 865 chipset.
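As a rough illustration of the "distill both the generator and the discriminator" step, the sketch below combines an output-level distillation loss for the compressed student generator with a logit-matching loss for the student discriminator. The function and argument names, the choice of L1/MSE losses, and the loss weights are assumptions made for illustration; they are not the paper's actual objectives or architectures.

```python
import torch
import torch.nn.functional as F

def joint_distillation_losses(student_G, teacher_G, student_D, teacher_D,
                              lr_img, w_gen=1.0, w_disc=0.1):
    """Illustrative joint G/D distillation for a compressed SR GAN (not MFAGAN's exact losses)."""
    with torch.no_grad():
        teacher_sr = teacher_G(lr_img)          # teacher's super-resolved output
    student_sr = student_G(lr_img)              # compressed student's output

    # Generator distillation: the student output mimics the teacher output (pixel-wise L1).
    gen_distill = F.l1_loss(student_sr, teacher_sr)

    # Discriminator distillation: the student D matches the teacher D's logits on the
    # student's fake images, keeping the student D balanced with the smaller generator.
    with torch.no_grad():
        teacher_logits = teacher_D(student_sr)
    student_logits = student_D(student_sr.detach())
    disc_distill = F.mse_loss(student_logits, teacher_logits)

    return w_gen * gen_distill, w_disc * disc_distill
```

In practice these distillation terms would be added to the usual adversarial and reconstruction losses; the detach on the student's output ensures the discriminator term trains only the student discriminator.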
