Codebook Features: Sparse and Discrete Interpretability for Neural Networks

Oct 26, 2023

View paper on arXiv