Title: Causal Abstractions of Neural Networks

Jun 06, 2021

Atticus Geiger, Hanson Lu, Thomas Icard, Christopher Potts

Abstract: Structural analysis methods (e.g., probing and feature attribution) are increasingly important tools for neural network analysis. We propose a new structural analysis method grounded in a formal theory of causal abstraction that provides rich characterizations of model-internal representations and their roles in input/output behavior. In this method, neural representations are aligned with variables in interpretable causal models, and then interchange interventions are used to experimentally verify that the neural representations have the causal properties of their aligned variables. We apply this method in a case study to analyze neural models trained on the Multiply Quantified Natural Language Inference (MQNLI) corpus, a highly complex NLI dataset that was constructed with a tree-structured natural logic causal model. We discover that a BERT-based model with state-of-the-art performance successfully realizes the approximate causal structure of the natural logic causal model, whereas a simpler baseline model fails to show any such structure, demonstrating that the BERT-based model's representations encode the compositional structure of MQNLI examples.
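
To make the idea of an interchange intervention concrete, here is a minimal toy sketch. It is not the authors' implementation and does not involve MQNLI or BERT. Everything in it is a hypothetical illustration: a high-level causal model with an intermediate variable S = x + y and output S + z, a hand-written "network" whose hidden unit 0 happens to compute x + y, and the function names themselves. The sketch runs the network on a base input, swaps in one hidden unit's value from a run on a source input, and checks whether the output matches what the high-level model predicts when S is set to its source value.

```python
from itertools import product

# Hypothetical high-level causal model: S = x + y, output = S + z.
def high_level_s(inp):
    x, y, _ = inp
    return x + y

def high_level(inp, s_override=None):
    # If s_override is given, intervene on S instead of computing it.
    _, _, z = inp
    s = high_level_s(inp) if s_override is None else s_override
    return s + z

# Hypothetical toy "network": hidden = [x + y, z], output = sum of hidden units.
def hidden(inp):
    x, y, z = inp
    return [x + y, z]

def readout(h):
    return h[0] + h[1]

def interchange_low(base, source, unit):
    # Run on base, but overwrite one hidden unit with its value under source.
    h = hidden(base)
    h[unit] = hidden(source)[unit]
    return readout(h)

def interchange_high(base, source):
    # Same intervention on the aligned high-level variable S.
    return high_level(base, s_override=high_level_s(source))

def interchange_accuracy(unit, values=range(3)):
    # Fraction of (base, source) pairs where the network and the
    # high-level model agree after the interchange intervention.
    inputs = list(product(values, repeat=3))
    pairs = [(b, s) for b in inputs for s in inputs]
    hits = sum(
        interchange_low(b, s, unit) == interchange_high(b, s) for b, s in pairs
    )
    return hits / len(pairs)

if __name__ == "__main__":
    print("unit 0 aligned with S:", interchange_accuracy(0))
    print("unit 1 aligned with S:", interchange_accuracy(1))
```

In this toy setting, aligning S with hidden unit 0 yields agreement on every input pair, while aligning S with hidden unit 1 (which stores z) does not. This mirrors, in miniature, the contrast the abstract draws between a model that realizes the causal structure and one that fails to.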

View paper on

OpenReview

Paper and Code