Esben Kran

DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models

Oct 03, 2023
Albert Garde, Esben Kran, Fazl Barez

Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark

Jun 03, 2023
Jason Hoelscher-Obermaier, Julia Persson, Esben Kran, Ioannis Konstas, Fazl Barez

Neuron to Graph: Interpreting Language Model Neurons at Scale

May 31, 2023
Alex Foote, Neel Nanda, Esben Kran, Ioannis Konstas, Shay Cohen, Fazl Barez

N2G: A Scalable Approach for Quantifying Interpretable Neuron Representations in Large Language Models

Apr 22, 2023
Alex Foote, Neel Nanda, Esben Kran, Ioannis Konstas, Fazl Barez
