Kwangjun Ahn

Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise

Feb 02, 2024
Kwangjun Ahn, Zhiyu Zhang, Yunbum Kook, Yan Dai

Via arXiv

Linear attention is (maybe) all you need (to understand transformer optimization)

Oct 02, 2023
Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra

Via arXiv

A Unified Approach to Controlling Implicit Regularization via Mirror Descent

Jun 24, 2023
Haoyuan Sun, Khashayar Gatmiry, Kwangjun Ahn, Navid Azizan

Via arXiv

Smooth Model Predictive Control with Applications to Statistical Learning

Jun 02, 2023
Kwangjun Ahn, Daniel Pfrommer, Jack Umenberger, Tobia Marcucci, Zak Mhammedi, Ali Jadbabaie

Via arXiv

Transformers learn to implement preconditioned gradient descent for in-context learning

Jun 01, 2023
Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra

Via arXiv

How to escape sharp minima

May 25, 2023
Kwangjun Ahn, Ali Jadbabaie, Suvrit Sra

Via arXiv

The Crucial Role of Normalization in Sharpness-Aware Minimization

May 24, 2023
Yan Dai, Kwangjun Ahn, Suvrit Sra

Via arXiv

Learning threshold neurons via the "edge of stability"

Dec 14, 2022
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang

Via arXiv

Model Predictive Control via On-Policy Imitation Learning

Oct 17, 2022
Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie

Via arXiv

One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares

Jul 28, 2022
Youngjae Min, Kwangjun Ahn, Navid Azizan

Via arXiv