Connor Kissane

Auditing Games for Sandbagging

Dec 08, 2025
Via arXiv

Interpreting Attention Layer Outputs with Sparse Autoencoders

Jun 25, 2024
Via arXiv