Title: Meta-LR-Schedule-Net: Learned LR Schedules that Scale and Generalize

Jul 31, 2020

Jun Shu, Yanwen Zhu, Qian Zhao, Deyu Meng, Zongben Xu

Abstract: The learning rate (LR) is one of the most important hyper-parameters in stochastic gradient descent (SGD) for the training and generalization of deep neural networks (DNNs). However, current hand-designed LR schedules require the schedule form and its extra hyper-parameters to be specified manually in advance, which limits their ability to adapt to non-convex optimization problems, where the training dynamics vary significantly. To address this issue, we propose a model capable of adaptively learning an LR schedule from data. Specifically, we design a meta-learner with an explicit mapping formulation to parameterize LR schedules, which can adjust the LR adaptively to match the current training dynamics by leveraging information from past training history. Experiments on image and text classification benchmarks substantiate the capability of our method to find proper LR schedules compared with baseline methods. Moreover, we transfer the learned LR schedule to various other tasks, including different training batch sizes, epochs, datasets, and network architectures, and in particular the large-scale ImageNet dataset, showing stronger generalization capability than related methods. Finally, guided by a small clean validation set, we show that our method achieves better generalization error than baseline methods when the training data is biased by noise corruption.
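To make the idea of an "explicit mapping" from training history to LR concrete, here is a minimal sketch. It is not the authors' model: the abstract does not specify the architecture, so the recurrent form, the choice of the training loss as input, the class name `TinyLRScheduler`, and all weights are assumptions for illustration. The weights are fixed by hand here, whereas the paper's meta-learner would learn them from data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyLRScheduler:
    """Hypothetical recurrent mapping (loss_t, hidden state) -> lr_t.

    The hidden state h summarizes past losses, so the predicted LR depends
    on training history rather than only on the step index.
    """

    def __init__(self, w_in=0.5, w_h=0.9, w_out=1.0, bias=0.0, max_lr=0.1):
        # Hand-picked illustrative weights; a real meta-learner would learn them.
        self.w_in, self.w_h, self.w_out, self.bias = w_in, w_h, w_out, bias
        self.max_lr = max_lr
        self.h = 0.0

    def step(self, loss):
        # Update the memory of past training dynamics with the current loss.
        self.h = math.tanh(self.w_h * self.h + self.w_in * loss)
        # Squash to (0, max_lr) so the LR stays positive and bounded.
        return self.max_lr * sigmoid(self.w_out * self.h + self.bias)


def run_demo(steps=50, x0=5.0):
    """Minimize f(x) = x^2 by gradient descent with the predicted LR."""
    scheduler = TinyLRScheduler()
    x = x0
    lrs, losses = [], []
    for _ in range(steps):
        loss = x * x
        lr = scheduler.step(loss)
        x -= lr * 2.0 * x  # gradient of x^2 is 2x
        lrs.append(lr)
        losses.append(loss)
    return lrs, losses


lrs, losses = run_demo()
print(f"first lr={lrs[0]:.4f}, last lr={lrs[-1]:.4f}, final loss={losses[-1]:.2e}")
```

In this toy setting the predicted LR starts higher while the loss is large and settles toward a smaller value as the loss shrinks, which is the kind of history-dependent behavior a fixed hand-designed schedule cannot express without manual tuning.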

* 21 pages
