Yanghua Peng

ByteDance

MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production

May 19, 2025

Seed1.5-VL Technical Report

May 11, 2025

OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training

Apr 14, 2025

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Apr 11, 2025

Goku: Flow Based Video Generative Foundation Models

Feb 10, 2025

HybridFlow: A Flexible and Efficient RLHF Framework

Sep 28, 2024

Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Aug 07, 2024

ByteCheckpoint: A Unified Checkpointing System for LLM Development

Jul 29, 2024

QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices

Jul 02, 2024

LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization

Mar 02, 2024