Xipeng Li

MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core

Apr 21, 2025
Via arXiv

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity

Aug 29, 2020
Via arXiv