William Merrill

Michael Pokorny

RELIC: Evaluating Compositional Instruction Following via Language Recognition

Jun 05, 2025
Via arXiv

Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training

May 29, 2025
Via arXiv

Exact Expressive Power of Transformers with Padding

May 25, 2025
Via arXiv

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Mar 18, 2025
Via arXiv

A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers

Mar 05, 2025
Via arXiv

Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases

Feb 26, 2025
Via arXiv

Humanity's Last Exam

Jan 24, 2025
Via arXiv

2 OLMo 2 Furious

Dec 31, 2024
Via arXiv

Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG

Jun 18, 2024
Via arXiv

Let's Think Dot by Dot: Hidden Computation in Transformer Language Models

Apr 24, 2024
Via arXiv