Tuomas Oikarinen

Lily

Faithful and Stable Neuron Explanations for Trustworthy Mechanistic Interpretability

Dec 19, 2025
Via arXiv

Rethinking Crowd-Sourced Evaluation of Neuron Explanations

Jun 09, 2025
Via arXiv

Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

Jun 06, 2025
Via arXiv

Interpretable Generative Models through Post-hoc Concept Bottlenecks

Mar 25, 2025
Via arXiv

Concept Bottleneck Large Language Models

Dec 11, 2024
Via arXiv

Concept Bottleneck Language Models For protein design

Nov 09, 2024
Via arXiv

Crafting Large Language Models for Enhanced Interpretability

Jul 05, 2024
Via arXiv

Linear Explanations for Individual Neurons

May 10, 2024
Via arXiv

Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models

Mar 20, 2024
Via arXiv

Corrupting Neuron Explanations of Deep Visual Features

Oct 25, 2023
Via arXiv