Stella Biderman

Suppressing Pink Elephants with Direct Principle Feedback

Feb 13, 2024

The Case for Co-Designing Model Architectures with Hardware

Jan 30, 2024

Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion

Jan 23, 2024

Grokking Group Multiplication with Cosets

Dec 11, 2023

Llemma: An Open Language Model For Mathematics

Oct 16, 2023

Stay on topic with Classifier-Free Guidance

Jun 30, 2023

LEACE: Perfect linear concept erasure in closed form

Jun 23, 2023

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration

Jun 02, 2023

Recasting Self-Attention with Holographic Reduced Representations

May 31, 2023

Can Transformers Learn to Solve Problems Recursively?

May 24, 2023