Lev McKinney

An Independent Safety Evaluation of Kimi K2.5

Apr 03, 2026
Via arXiv

Gauss-Newton Unlearning for the LLM Era

Feb 11, 2026
Via arXiv

Eliciting Latent Predictions from Transformers with the Tuned Lens

Mar 15, 2023
Via arXiv

On The Fragility of Learned Reward Functions

Jan 09, 2023
Via arXiv