Margaret Mitchell

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Jun 25, 2024
Via arXiv

CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models

May 22, 2024
Via arXiv

Evaluating the Social Impact of Generative AI Systems in Systems and Society

Jun 12, 2023
Via arXiv

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Mar 07, 2023
Via arXiv

Measuring Data

Dec 09, 2022
Via arXiv

The Stack: 3 TB of permissively licensed source code

Nov 20, 2022
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv

A Human Rights-Based Approach to Responsible AI

Oct 06, 2022
Via arXiv

Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements

Oct 06, 2022
Via arXiv

Measuring Model Biases in the Absence of Ground Truth

Mar 05, 2021
Via arXiv