Colin Raffel

Shammie

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

Aug 13, 2024
Via arXiv

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Jun 25, 2024
Via arXiv

Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

Apr 08, 2024
Via arXiv

A Survey on Data Selection for Language Models

Mar 08, 2024
Via arXiv

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Feb 16, 2024
Via arXiv

Learning to Route Among Specialized Experts for Zero-Shot Generalization

Feb 08, 2024
Via arXiv

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Dec 13, 2023
Via arXiv

Merging by Matching Models in Task Subspaces

Dec 07, 2023
Via arXiv

Efficient Online Data Mixing For Language Model Pre-Training

Dec 05, 2023
Via arXiv

ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization

Nov 22, 2023
Via arXiv