Sid Black

Do Large Language Models Know What They Are Capable Of?

Dec 31, 2025
Via arXiv

Auditing Games for Sandbagging

Dec 08, 2025
Via arXiv

RepliBench: Evaluating the autonomous replication capabilities of language model agents

Apr 21, 2025
Via arXiv

Interpreting Neural Networks through the Polytope Lens

Nov 22, 2022
Via arXiv

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

Apr 14, 2022
Via arXiv

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

Dec 31, 2020
Via arXiv