Shuguang Wang

Understanding Stragglers in Large Model Training Using What-if Analysis

May 09, 2025

Advancing TDFN: Precise Fixation Point Generation Using Reconstruction Differences

Jan 26, 2025

Task-Driven Fixation Network: An Efficient Architecture with Fixation Selection

Jan 02, 2025

Minder: Faulty Machine Detection for Large-scale Distributed Model Training

Nov 04, 2024