
Stas Bekman

Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training

Jun 27, 2024

The Case for Co-Designing Model Architectures with Hardware

Jan 30, 2024

OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

Jun 21, 2023

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022

What Language Model to Train if You Have One Million GPU Hours?

Nov 08, 2022

Datasets: A Community Library for Natural Language Processing

Sep 07, 2021