Jiamang Wang

DDK: Distilling Domain Knowledge for Efficient Large Language Models
Jul 23, 2024

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach
Jun 07, 2024

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
Jun 03, 2024

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems
Apr 17, 2022

Exploring Sparse Expert Models and Beyond
Jun 14, 2021

Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Apr 19, 2021