Julien Launay

The Falcon Series of Open Language Models

Nov 29, 2023

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Jun 01, 2023

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022

What Language Model to Train if You Have One Million GPU Hours?

Nov 08, 2022

Scaling Laws Beyond Backpropagation

Oct 26, 2022

What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?

Apr 12, 2022

PAGnol: An Extra-Large French Generative Model

Oct 16, 2021

Is the Number of Trainable Parameters All That Actually Matters?

Sep 24, 2021

Photonic Differential Privacy with Direct Feedback Alignment

Jun 07, 2021

Adversarial Robustness by Design through Analog Computing and Synthetic Gradients

Jan 06, 2021