William Merrill

Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG

Jun 18, 2024

Let's Think Dot by Dot: Hidden Computation in Transformer Language Models

Apr 24, 2024

The Illusion of State in State-Space Models

Apr 12, 2024

Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment

Feb 29, 2024

OLMo: Accelerating the Science of Language Models

Feb 07, 2024

Transformers as Recognizers of Formal Languages: A Survey on Expressivity

Nov 01, 2023

The Expressive Power of Transformers with Chain of Thought

Oct 18, 2023

How Language Model Hallucinations Can Snowball

May 22, 2023

A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks

Mar 21, 2023

Transparency Helps Reveal When Language Models Learn Meaning

Oct 14, 2022