Aditya Varre

Why Do We Need Weight Decay in Modern Deep Learning?

Oct 06, 2023
Maksym Andriushchenko, Francesco D'Angelo, Aditya Varre, Nicolas Flammarion

Via arXiv

SGD with large step sizes learns sparse features

Oct 11, 2022
Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion

Via arXiv

Accelerated SGD for Non-Strongly-Convex Least Squares

Mar 03, 2022
Aditya Varre, Nicolas Flammarion

Via arXiv

Last iterate convergence of SGD for Least-Squares in the Interpolation regime

Feb 05, 2021
Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion

Via arXiv