Atish Agarwala

Stepping on the Edge: Curvature Aware Learning Rate Tuners
Jul 08, 2024

A Clipped Trip: the Dynamics of SGD with Gradient Clipping in High-Dimensions
Jun 17, 2024

High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Apr 30, 2024

Gradient descent induces alignment between weights and the empirical NTK for deep non-linear networks
Feb 07, 2024

Neglected Hessian component explains mysteries in Sharpness regularization
Jan 24, 2024

On the Interplay Between Stepsize Tuning and Progressive Sharpening
Dec 07, 2023

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Feb 17, 2023

Second-order regression models exhibit progressive sharpening to the edge of stability
Oct 10, 2022

Deep equilibrium networks are sensitive to initialization statistics
Jul 19, 2022

One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
Mar 29, 2021