Emily Reif

The Evolution of LLM Adoption in Industry Data Curation Practices

Dec 20, 2024
Via arXiv

Who's asking? User personas and the mechanics of latent misalignment

Jun 17, 2024
Via arXiv

Automatic Histograms: Leveraging Language Models for Text Dataset Exploration

Feb 21, 2024
Via arXiv

Understanding the Dataset Practitioners Behind Large Language Model Development

Feb 21, 2024
Via arXiv

LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models

Feb 16, 2024
Via arXiv

SoUnD Framework: Analyzing (So)cial Representation in (Un)structured (D)ata

Dec 01, 2023
Via arXiv

Data Similarity is Not Enough to Explain Language Model Performance

Nov 15, 2023
Via arXiv

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

May 22, 2023
Via arXiv

Visualizing Linguistic Diversity of Text Datasets Synthesized by Large Language Models

May 19, 2023
Via arXiv

PaLM 2 Technical Report

May 17, 2023
Via arXiv