Frederick Eberhardt

Lower Bounds on the Size of Markov Equivalence Classes

Jun 26, 2025

Erik Jahn, Frederick Eberhardt, Leonard J. Schulman

Abstract:Causal discovery algorithms typically recover causal graphs only up to their Markov equivalence classes unless additional parametric assumptions are made. The sizes of these equivalence classes reflect the limits of what can be learned about the underlying causal graph from purely observational data. Under the assumptions of acyclicity, causal sufficiency, and a uniform model prior, Markov equivalence classes are known to be small on average. In this paper, we show that this is no longer the case when any of these assumptions is relaxed. Specifically, we prove exponentially large lower bounds for the expected size of Markov equivalence classes in three settings: sparse random directed acyclic graphs, uniformly random acyclic directed mixed graphs, and uniformly random directed cyclic graphs.
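
To make the notion of a Markov equivalence class concrete, here is a minimal brute-force sketch, not taken from the paper. It assumes the standard characterization that two DAGs are Markov equivalent exactly when they share the same skeleton and the same unshielded colliders. The helper names (`all_dags`, `is_acyclic`, `mec_key`, `mec_sizes`) are illustrative only, and the enumeration is feasible only for very small graphs.

```python
from collections import Counter
from itertools import combinations, product


def is_acyclic(n, edges):
    """Return True if the directed graph on nodes 0..n-1 has no directed cycle."""
    remaining = set(range(n))
    while remaining:
        # Nodes with no incoming edge from another remaining node.
        sources = [v for v in remaining
                   if not any(b == v and a in remaining for a, b in edges)]
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining -= set(sources)
    return True


def all_dags(n):
    """Yield every DAG on n labeled nodes as a frozenset of edges (a, b) meaning a -> b."""
    pairs = list(combinations(range(n), 2))
    # For each unordered pair: 0 = no edge, 1 = a -> b, 2 = b -> a.
    for choice in product((0, 1, 2), repeat=len(pairs)):
        edges = set()
        for (a, b), c in zip(pairs, choice):
            if c == 1:
                edges.add((a, b))
            elif c == 2:
                edges.add((b, a))
        if is_acyclic(n, edges):
            yield frozenset(edges)


def mec_key(edges):
    """Skeleton plus unshielded colliders: identical keys mean Markov equivalent DAGs."""
    skeleton = frozenset(frozenset(e) for e in edges)
    colliders = set()
    for a, c in edges:
        for b, c2 in edges:
            # a -> c <- b with a and b non-adjacent is an unshielded collider.
            if c2 == c and a < b and frozenset((a, b)) not in skeleton:
                colliders.add((a, c, b))
    return skeleton, frozenset(colliders)


def mec_sizes(n):
    """Sorted list of Markov equivalence class sizes over all DAGs on n nodes."""
    counts = Counter(mec_key(d) for d in all_dags(n))
    return sorted(counts.values())


print(mec_sizes(3))  # [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 6]
```

On three nodes the 25 DAGs fall into 11 classes, so the average class size under a uniform prior over DAGs is about 2.3. This only illustrates the quantity being bounded; it does not reproduce the sparse, mixed-graph, or cyclic settings studied in the paper.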

Modeling Discrimination with Causal Abstraction

Jan 14, 2025

Milan Mossé, Kara Schechtman, Frederick Eberhardt, Thomas Icard

Abstract:A person is directly racially discriminated against only if her race caused her worse treatment. This implies that race is an attribute sufficiently separable from other attributes to isolate its causal role. But race is embedded in a nexus of social factors that resist isolated treatment. If race is socially constructed, in what sense can it cause worse treatment? Some propose that the perception of race, rather than race itself, causes worse treatment. Others suggest that since causal models require modularity, i.e. the ability to isolate causal effects, attempts to causally model discrimination are misguided. This paper addresses the problem differently. We introduce a framework for reasoning about discrimination, in which race is a high-level abstraction of lower-level features. In this framework, race can be modeled as itself causing worse treatment. Modularity is ensured by allowing assumptions about social construction to be precisely and explicitly stated, via an alignment between race and its constituents. Such assumptions can then be subjected to normative and empirical challenges, which lead to different views of when discrimination occurs. By distinguishing constitutive and causal relations, the abstraction framework pinpoints disagreements in the current literature on modeling discrimination, while preserving a precise causal account of discrimination.

Controlling for discrete unmeasured confounding in nonlinear causal models

Aug 10, 2024

Patrick Burauel, Frederick Eberhardt, Michel Besserve

Abstract:Unmeasured confounding is a major challenge for identifying causal relationships from non-experimental data. Here, we propose a method that can accommodate unmeasured discrete confounding. Extending recent identifiability results in deep latent variable models, we show theoretically that confounding can be detected and corrected under the assumption that the observed data is a piecewise affine transformation of a latent Gaussian mixture model and that the identity of the mixture components is confounded. We provide a flow-based algorithm to estimate this model and perform deconfounding. Experimental results on synthetic and real-world data provide support for the effectiveness of our approach.

Approximate Causal Abstraction

Jun 29, 2019

Sander Beckers, Frederick Eberhardt, Joseph Y. Halpern

Abstract:Scientific models describe natural phenomena at different levels of abstraction. Abstract descriptions can provide the basis for interventions on the system and explanation of observed phenomena at a level of granularity that is coarser than the most fundamental account of the system. Beckers and Halpern (2019), building on work of Rubenstein et al. (2017), developed an account of abstraction for causal models that is exact. Here we extend this account to the more realistic case where an abstract causal model offers only an approximation of the underlying system. We show how the resulting account handles the discrepancy that can arise between low- and high-level causal models of the same system, and in the process provide an account of how one causal model approximates another, a topic of independent interest. Finally, we extend the account of approximate abstractions to probabilistic causal models, indicating how and where uncertainty can enter into an approximate abstraction.

* Appears in UAI-2019

ASP-based Discovery of Semi-Markovian Causal Models under Weaker Assumptions

Jun 06, 2019

Zhalama, Jiji Zhang, Frederick Eberhardt, Wolfgang Mayer, Mark Junjie Li

Abstract:In recent years the possibility of relaxing the so-called Faithfulness assumption in automated causal discovery has been investigated. The investigation showed (1) that the Faithfulness assumption can be weakened in various ways that in an important sense preserve its power, and (2) that weakening of Faithfulness may help to speed up methods based on Answer Set Programming. However, this line of work has so far only considered the discovery of causal models without latent variables. In this paper, we study weakenings of Faithfulness for constraint-based discovery of semi-Markovian causal models, which accommodate the possibility of latent variables, and show that both (1) and (2) remain the case in this more realistic setting.

* 12 pages, 6 figures, IJCAI 2019

Fast Conditional Independence Test for Vector Variables with Large Sample Sizes

Apr 08, 2018

Krzysztof Chalupka, Pietro Perona, Frederick Eberhardt

Abstract:We present and evaluate the Fast (conditional) Independence Test (FIT) -- a nonparametric conditional independence test. The test is based on the idea that when $P(X \mid Y, Z) = P(X \mid Y)$, $Z$ is not useful as a feature to predict $X$, as long as $Y$ is also a regressor. On the contrary, if $P(X \mid Y, Z) \neq P(X \mid Y)$, $Z$ might improve prediction results. FIT applies to thousand-dimensional random variables with a hundred thousand samples in a fraction of the time required by alternative methods. We provide an extensive evaluation that compares FIT to six extant nonparametric independence tests. The evaluation shows that FIT has low probability of making both Type I and Type II errors compared to other tests, especially as the number of available samples grows. Our implementation of FIT is publicly available.
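
The prediction-based idea in this abstract can be illustrated with a small sketch. This is not the authors' FIT implementation: it swaps in ordinary least squares as the regressor (so it only detects linear dependence), uses synthetic Gaussian data, and reports the raw difference in held-out error rather than running a significance test. All names (`fit_linear`, `heldout_mse`, `prediction_gain`) are hypothetical.

```python
import random


def fit_linear(rows, targets):
    """Least-squares weights (intercept first) via normal equations and Gauss-Jordan."""
    feats = [[1.0] + list(r) for r in rows]
    k = len(feats[0])
    # Augmented matrix [A^T A | A^T t].
    m = [[sum(f[i] * f[j] for f in feats) for j in range(k)]
         + [sum(f[i] * t for f, t in zip(feats, targets))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def heldout_mse(rows, targets, split):
    """Fit on the first `split` samples, return mean squared error on the rest."""
    w = fit_linear(rows[:split], targets[:split])
    errs = [(t - (w[0] + sum(wi * v for wi, v in zip(w[1:], r)))) ** 2
            for r, t in zip(rows[split:], targets[split:])]
    return sum(errs) / len(errs)


def prediction_gain(x, y, z, split):
    """Held-out MSE predicting X from Y alone, minus MSE predicting X from Y and Z."""
    mse_y = heldout_mse([[yi] for yi in y], x, split)
    mse_yz = heldout_mse([[yi, zi] for yi, zi in zip(y, z)], x, split)
    return mse_y - mse_yz


rng = random.Random(0)
n = 2000
y = [rng.gauss(0, 1) for _ in range(n)]
z = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 0.1) for _ in range(n)]
x_indep = [yi + e for yi, e in zip(y, noise)]            # X depends on Y only
x_dep = [yi + zi + e for yi, zi, e in zip(y, z, noise)]  # X depends on Y and Z

gain_indep = prediction_gain(x_indep, y, z, n // 2)
gain_dep = prediction_gain(x_dep, y, z, n // 2)
print(f"gain when X is independent of Z given Y: {gain_indep:.4f}")
print(f"gain when X depends on Z given Y:        {gain_dep:.4f}")
```

A gain near zero is consistent with $P(X \mid Y, Z) = P(X \mid Y)$, while a clearly positive gain suggests that $Z$ carries information about $X$ beyond $Y$. A practical test would need a nonparametric regressor and a calibrated decision rule, which this sketch leaves out.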

Estimating Causal Direction and Confounding of Two Discrete Variables

Nov 04, 2016

Krzysztof Chalupka, Frederick Eberhardt, Pietro Perona

Abstract:We propose a method to classify the causal relationship between two discrete variables given only the joint distribution of the variables, acknowledging that the method is subject to an inherent baseline error. We assume that the causal system is acyclic, but we do allow for hidden common causes. Our algorithm presupposes that the probability distribution $P(C)$ of a cause $C$ is independent of the probability distribution $P(E\mid C)$ of the cause-effect mechanism. While our classifier is trained with a Bayesian assumption of flat hyperpriors, we do not make this assumption about our test data. This work connects to recent developments on the identifiability of causal models over continuous variables under the assumption of "independent mechanisms". Carefully commented Python notebooks that reproduce all our experiments are available online at http://vision.caltech.edu/~kchalupk/code.html.

Causal Discovery from Subsampled Time Series Data by Constraint Optimization

Jul 13, 2016

Antti Hyttinen, Sergey Plis, Matti Järvisalo, Frederick Eberhardt, David Danks

Abstract:This paper focuses on causal structure estimation from time series data in which measurements are obtained at a coarser timescale than the causal timescale of the underlying system. Previous work has shown that such subsampling can lead to significant errors about the system's causal structure if not properly taken into account. In this paper, we first consider the search for the system timescale causal structures that correspond to a given measurement timescale structure. We provide a constraint satisfaction procedure whose computational performance is several orders of magnitude better than previous approaches. We then consider finite-sample data as input, and propose the first constraint optimization approach for recovering the system timescale causal structure. This algorithm optimally recovers from possible conflicts due to statistical errors. More generally, these advances allow for a robust and non-parametric estimation of system timescale causal structures from subsampled time series data.

* International Conference on Probabilistic Graphical Models, PGM 2016

Unsupervised Discovery of El Nino Using Causal Feature Learning on Microlevel Climate Data

May 30, 2016

Krzysztof Chalupka, Tobias Bischoff, Pietro Perona, Frederick Eberhardt

Abstract:We show that the climate phenomena of El Nino and La Nina arise naturally as states of macro-variables when our recent causal feature learning framework (Chalupka 2015, Chalupka 2016) is applied to micro-level measures of zonal wind (ZW) and sea surface temperatures (SST) taken over the equatorial band of the Pacific Ocean. The method identifies these unusual climate states on the basis of the relation between ZW and SST patterns without any input about past occurrences of El Nino or La Nina. The simpler alternatives of (i) clustering the SST fields while disregarding their relationship with ZW patterns, or (ii) clustering the joint ZW-SST patterns, do not discover El Nino. We discuss the degree to which our method supports a causal interpretation and use a low-dimensional toy example to explain its success over other clustering approaches. Finally, we propose a new robust and scalable alternative to our original algorithm (Chalupka 2016), which circumvents the need for high-dimensional density learning.

* Accepted for plenary presentation at UAI 2016

Multi-Level Cause-Effect Systems

Dec 25, 2015

Krzysztof Chalupka, Pietro Perona, Frederick Eberhardt

Abstract:We present a domain-general account of causation that applies to settings in which macro-level causal relations between two systems are of interest, but the relevant causal features are poorly understood and have to be aggregated from vast arrays of micro-measurements. Our approach generalizes that of Chalupka et al. (2015) to the setting in which the macro-level effect is not specified. We formalize the connection between micro- and macro-variables in such situations and provide a coherent framework describing causal relations at multiple levels of analysis. We present an algorithm that discovers macro-variable causes and effects from micro-level measurements obtained from an experiment. We further show how to design experiments to discover macro-variables from observational micro-variable data. Finally, we show that under specific conditions, one can identify multiple levels of causal structure. Throughout the article, we use a simulated neuroscience multi-unit recording experiment to illustrate the ideas and the algorithms.
