Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rahul Krishnan

Estimating Treatment Effects with Independent Component Analysis

Jul 22, 2025

Patrik Reizinger, Lester Mackey, Wieland Brendel, Rahul Krishnan

Abstract:The field of causal inference has developed a variety of methods to accurately estimate treatment effects in the presence of nuisance. Meanwhile, the field of identifiability theory has developed methods like Independent Component Analysis (ICA) to identify latent sources and mixing weights from data. While these two research communities have developed largely independently, they aim to achieve similar goals: the accurate and sample-efficient estimation of model parameters. In the partially linear regression (PLR) setting, Mackey et al. (2018) recently found that estimation consistency can be improved with non-Gaussian treatment noise. Non-Gaussianity is also a crucial assumption for identifying latent factors in ICA. We provide the first theoretical and empirical insights into this connection, showing that ICA can be used for causal effect estimation in the PLR model. Surprisingly, we find that linear ICA can accurately estimate multiple treatment effects even in the presence of Gaussian confounders or nonlinear nuisance.

Via

Access Paper or Ask Questions

Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry

May 08, 2025

Mohammed Adnan, Rohan Jain, Ekansh Sharma, Rahul Krishnan, Yani Ioannou

Abstract:The Lottery Ticket Hypothesis (LTH) suggests there exists a sparse LTH mask and weights that achieve the same generalization performance as the dense model while using significantly fewer parameters. However, finding a LTH solution is computationally expensive, and a LTH sparsity mask does not generalize to other random weight initializations. Recent work has suggested that neural networks trained from random initialization find solutions within the same basin modulo permutation, and proposes a method to align trained models within the same loss basin. We hypothesize that misalignment of basins is the reason why LTH masks do not generalize to new random initializations and propose permuting the LTH mask to align with the new optimization basin when performing sparse training from a different random init. We empirically show a significant increase in generalization when sparse training from random initialization with the permuted mask as compared to using the non-permuted LTH mask, on multiple datasets (CIFAR-10, CIFAR-100 and ImageNet) and models (VGG11, ResNet20 and ResNet50).

* Accepted at ICML 2025

Via

Access Paper or Ask Questions