Congliang Chen

Why Transformers Need Adam: A Hessian Perspective

Feb 26, 2024
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo

Via arXiv

Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz

Oct 23, 2023
Tao Sun, Congliang Chen, Peng Qiao, Li Shen, Xinwang Liu, Dongsheng Li

Via arXiv

Adam Can Converge Without Any Modification on Update Rules

Aug 23, 2022
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo

Via arXiv

Efficient-Adam: Communication-Efficient Distributed Adam with Complexity Analysis

May 28, 2022
Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo

Via arXiv

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration

Jan 14, 2021
Congliang Chen, Li Shen, Fangyu Zou, Wei Liu

Via arXiv

Quantized Adam with Error Feedback

Apr 29, 2020
Congliang Chen, Li Shen, Haozhi Huang, Qi Wu, Wei Liu

Via arXiv

Arbitrary Style Transfer with Deep Feature Reshuffle

Jun 20, 2018
Shuyang Gu, Congliang Chen, Jing Liao, Lu Yuan

Via arXiv