Naomi Saphra

A Taxonomy of Transcendence

Aug 25, 2025

Can Interpretation Predict Behavior on Unseen Data?

Jul 08, 2025

Interpreting the Linear Structure of Vision-language Model Embedding Spaces

Apr 16, 2025

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

Mar 12, 2025

Distributional Scaling Laws for Emergent Capabilities

Feb 24, 2025

Sometimes I am a Tree: Data Drives Unstable Hierarchical Generalization

Dec 05, 2024

Mechanistic?

Oct 07, 2024

Fast Forwarding Low-Rank Training

Sep 06, 2024

Benchmarks as Microscopes: A Call for Model Metrology

Jul 22, 2024

ChatGPT Doesn't Trust Chargers Fans: Guardrail Sensitivity in Context

Jul 10, 2024