
Youshan Miao — Publications

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search

Nov 26, 2023

Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training

May 31, 2023

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction

Jan 21, 2023

Dense-to-Sparse Gate for Mixture-of-Experts

Dec 29, 2021

CoCoNet: Co-Optimizing Computation and Communication for Distributed Machine Learning

May 13, 2021

CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner

Mar 14, 2021

Architectural Implications of Graph Neural Networks

Sep 02, 2020

Towards Efficient Large-Scale Graph Neural Network Computing

Oct 19, 2018

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA

May 22, 2018