Atish Agarwala

Gradient descent induces alignment between weights and the empirical NTK for deep non-linear networks

Feb 07, 2024
Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala

Neglected Hessian component explains mysteries in Sharpness regularization

Jan 24, 2024
Yann N. Dauphin, Atish Agarwala, Hossein Mobahi

On the Interplay Between Stepsize Tuning and Progressive Sharpening

Dec 07, 2023
Vincent Roulet, Atish Agarwala, Fabian Pedregosa

SAM operates far from home: eigenvalue regularization as a dynamical phenomenon

Feb 17, 2023
Atish Agarwala, Yann N. Dauphin

Second-order regression models exhibit progressive sharpening to the edge of stability

Oct 10, 2022
Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington

Deep equilibrium networks are sensitive to initialization statistics

Jul 19, 2022
Atish Agarwala, Samuel S. Schoenholz

One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks

Mar 29, 2021
Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal Sharan, Xin Wang, Qiuyi Zhang

Temperature check: theory and practice for training models with softmax-cross-entropy losses

Oct 14, 2020
Atish Agarwala, Jeffrey Pennington, Yann Dauphin, Sam Schoenholz

Learning the gravitational force law and other analytic functions

May 15, 2020
Atish Agarwala, Abhimanyu Das, Rina Panigrahy, Qiuyi Zhang
