Congliang Chen

Why Transformers Need Adam: A Hessian Perspective

Feb 26, 2024
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo

Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz

Oct 23, 2023
Tao Sun, Congliang Chen, Peng Qiao, Li Shen, Xinwang Liu, Dongsheng Li

Adam Can Converge Without Any Modification on Update Rules

Aug 23, 2022
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo

Efficient-Adam: Communication-Efficient Distributed Adam with Complexity Analysis

May 28, 2022
Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration

Jan 14, 2021
Congliang Chen, Li Shen, Fangyu Zou, Wei Liu

Quantized Adam with Error Feedback

Apr 29, 2020
Congliang Chen, Li Shen, Haozhi Huang, Qi Wu, Wei Liu

Arbitrary Style Transfer with Deep Feature Reshuffle

Jun 20, 2018
Shuyang Gu, Congliang Chen, Jing Liao, Lu Yuan
