Shenggui Li

GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding

Feb 03, 2024
Cunxiao Du, Jing Jiang, Xu Yuanchen, Jiawei Wu, Sicheng Yu, Yongqi Li, Shenggui Li, Kai Xu, Liqiang Nie, Zhaopeng Tu, Yang You

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

Feb 22, 2023
Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You

MAP: Memory-aware Automated Intra-op Parallel Training For Foundation Models

Feb 06, 2023
Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You

Elixir: Train a Large Language Model on a Small GPU Cluster

Dec 10, 2022
Haichen Huang, Jiarui Fang, Hongxin Liu, Shenggui Li, Yang You

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models

Sep 06, 2022
Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, Yang You

A Frequency-aware Software Cache for Large Recommendation System Embeddings

Aug 08, 2022
Jiarui Fang, Geng Zhang, Jiatong Han, Shenggui Li, Zhengda Bian, Yongbin Li, Jin Liu, Yang You

Sky Computing: Accelerating Geo-distributed Computing in Federated Learning

Feb 24, 2022
Jie Zhu, Shenggui Li, Yang You

PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management

Aug 12, 2021
Jiarui Fang, Yang Yu, Shenggui Li, Yang You, Jie Zhou

Online Evolutionary Batch Size Orchestration for Scheduling Deep Learning Workloads in GPU Clusters

Aug 08, 2021
Zhengda Bian, Shenggui Li, Wei Wang, Yang You

Sequence Parallelism: Making 4D Parallelism Possible

May 26, 2021
Shenggui Li, Fuzhao Xue, Yongbin Li, Yang You
