Shaoduo Gan

SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget

Apr 07, 2024
Zihao Wang, Shaoduo Gan

Few-shot Named Entity Recognition with Entity-level Prototypical Network Enhanced by Dispersedly Distributed Prototypes

Aug 17, 2022
Bin Ji, Shasha Li, Shaoduo Gan, Jie Yu, Jun Ma, Huijun Liu

Stochastic Gradient Descent without Full Data Shuffle

Jun 12, 2022
Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang

FRuDA: Framework for Distributed Adversarial Domain Adaptation

Dec 26, 2021
Shaoduo Gan, Akhil Mathur, Anton Isopoussu, Fahim Kawsar, Nadia Berthouze, Nicholas Lane

BAGUA: Scaling up Distributed Learning with System Relaxations

Jul 12, 2021
Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, Hongmei Shi, Shengzhuo Zhang, Xianghong Li, Tengxu Sun, Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, Ce Zhang

Towards Demystifying Serverless Machine Learning Training

May 17, 2021
Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang

1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed

Feb 04, 2021
Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He

APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm

Aug 28, 2020
Hanlin Tang, Shaoduo Gan, Samyam Rajbhandari, Xiangru Lian, Ji Liu, Yuxiong He, Ce Zhang
