Saeed Maleki

Microsoft Research

ForestColl: Efficient Collective Communications on Heterogeneous Network Fabrics

Feb 09, 2024
Liangyu Zhao, Saeed Maleki, Ziyue Yang, Hossein Pourreza, Aashaka Shah, Changho Hwang, Arvind Krishnamurthy

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search

Nov 26, 2023
Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang

Look-Up mAI GeMM: Increasing AI GeMMs Performance by Nearly 2.5x via msGeMM

Oct 09, 2023
Saeed Maleki

SuperScaler: Supporting Flexible DNN Parallelization via a Unified Abstraction

Jan 21, 2023
Zhiqi Lin, Youshan Miao, Guodong Liu, Xiaoxiang Shi, Quanlu Zhang, Fan Yang, Saeed Maleki, Yi Zhu, Xu Cao, Cheng Li, Mao Yang, Lintao Zhang, Lidong Zhou

Synthesizing Collective Communication Algorithms for Heterogeneous Networks with TACCL

Nov 15, 2021
Aashaka Shah, Vijay Chidambaram, Meghan Cowan, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi, Rachee Singh

Total Least Squares for Optimal Pose Estimation

Jun 22, 2021
Saeed Maleki, John Crassidis, Yang Cheng, Matthias Schmid

CoCoNet: Co-Optimizing Computation and Communication for Distributed Machine Learning

May 13, 2021
Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki, Youshan Miao, Madanlal Musuvathi, Todd Mytkowicz, Olli Saarikivi

Scaling Distributed Training with Adaptive Summation

Jun 04, 2020
Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi, Tianju Xu, Vadim Eksarevskiy, Jaliya Ekanayake, Emad Barsoum

Distributed Word2Vec using Graph Analytics Frameworks

Sep 08, 2019
Gurbinder Gill, Roshan Dathathri, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi
