Raphael Sarfati

Anatomy of Post-Training: Using Interpretability to Characterize Data and Shape the Learning Signal

Jun 10, 2026
Via arXiv

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

May 06, 2026
Via arXiv

Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

Mar 05, 2026
Via arXiv

Tracking and triangulating firefly flashes in field recordings

Oct 25, 2024
Via arXiv