Stella Biderman

LLM Circuit Analyses Are Consistent Across Training and Scale

Jul 15, 2024
Via arXiv

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

Jun 26, 2024
Via arXiv

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon

Jun 25, 2024
Via arXiv

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

Jun 06, 2024
Via arXiv

Lessons from the Trenches on Reproducible Evaluation of Language Models

May 23, 2024
Via arXiv

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Apr 10, 2024
Via arXiv

Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection

Mar 23, 2024
Via arXiv

On the Societal Impact of Open Foundation Models

Feb 27, 2024
Via arXiv

KMMLU: Measuring Massive Multitask Language Understanding in Korean

Feb 18, 2024
Via arXiv

Suppressing Pink Elephants with Direct Principle Feedback

Feb 13, 2024
Via arXiv