Stella Biderman

Every Eval Ever: A Unifying Schema and Community Repository for AI Evaluation Results

Jun 12, 2026

LoRA-Muon: Spectral Steepest Descent on the Low-Rank Manifold

Jun 11, 2026

Bergson: An Open Source Library for Data Attribution

Jun 10, 2026

Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting

Jun 09, 2026

Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics

Jun 03, 2026

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

Feb 18, 2026

Adversarial Samples Are Not Created Equal

Jan 02, 2026

Who Evaluates AI's Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations

Nov 06, 2025

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Jun 05, 2025

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

May 17, 2025