William Won

FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models

Jun 28, 2024
Via arXiv

TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Training

Apr 11, 2023
Via arXiv

ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale

Mar 24, 2023
Via arXiv

Themis: A Network Bandwidth-Aware Collective Scheduling Policy for Distributed Training of DL Models

Oct 09, 2021
Via arXiv

Exploring Multi-dimensional Hierarchical Network Topologies for Efficient Distributed Training of Trillion Parameter DL Models

Sep 24, 2021
Via arXiv