Youshan Miao

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search

Nov 26, 2023
Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang

Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training

May 31, 2023
Yijia Zhang, Yibo Han, Shijie Cao, Guohao Dai, Youshan Miao, Ting Cao, Fan Yang, Ningyi Xu

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction

Jan 21, 2023
Zhiqi Lin, Youshan Miao, Guodong Liu, Xiaoxiang Shi, Quanlu Zhang, Fan Yang, Saeed Maleki, Yi Zhu, Xu Cao, Cheng Li, Mao Yang, Lintao Zhang, Lidong Zhou

Dense-to-Sparse Gate for Mixture-of-Experts

Dec 29, 2021
Xiaonan Nie, Shijie Cao, Xupeng Miao, Lingxiao Ma, Jilong Xue, Youshan Miao, Zichao Yang, Zhi Yang, Bin Cui

CoCoNet: Co-Optimizing Computation and Communication for Distributed Machine Learning

May 13, 2021
Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki, Youshan Miao, Madanlal Musuvathi, Todd Mytkowicz, Olli Saarikivi

CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner

Mar 14, 2021
Cheng Luo, Lei Qu, Youshan Miao, Peng Cheng, Yongqiang Xiong

Architectural Implications of Graph Neural Networks

Sep 02, 2020
Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo

Towards Efficient Large-Scale Graph Neural Network Computing

Oct 19, 2018
Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, Yafei Dai

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA

May 22, 2018
Jilong Xue, Youshan Miao, Cheng Chen, Ming Wu, Lintao Zhang, Lidong Zhou
