Kyle Lo

Allen Institute for Artificial Intelligence

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
Jun 26, 2024

One Thousand and One Pairs: A "novel" challenge for long-context language models
Jun 24, 2024

DataComp-LM: In search of the next generation of training sets for language models
Jun 18, 2024

SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
Jun 10, 2024

FABLES: Evaluating faithfulness and content selection in book-length summarization
Apr 01, 2024

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Mar 22, 2024

KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions
Mar 06, 2024

OLMo: Accelerating the Science of Language Models
Feb 07, 2024

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Jan 31, 2024

InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification
Jan 29, 2024