Abstract:Pre-validation is a way to build a prediction model from two datasets of significantly different feature dimensions. Previous work showed that the asymptotic distribution of the test statistic for the pre-validated predictor deviates from the standard Normal, which leads to problems in hypothesis testing. In this paper, we revisit the pre-validation procedure and extend the problem formulation without any independence assumption on the two feature sets. We propose not only an analytical distribution of the test statistic for the pre-validated predictor under certain models, but also a generic bootstrap procedure for conducting inference. We demonstrate the properties and benefits of pre-validation in prediction, inference, and error estimation through simulations and various applications, including the analysis of a breast cancer study and a synthetic GWAS example.
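As a concrete illustration of the pre-validation procedure, here is a minimal sketch on synthetic data (the lasso internal model, the five-fold split, and all variable names are illustrative assumptions, not details from the paper):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p_omics = 200, 1000                    # small n, high-dimensional omics block
X_omics = rng.normal(size=(n, p_omics))   # e.g. expression features
X_clin = rng.normal(size=(n, 3))          # low-dimensional clinical features
y = rng.binomial(1, 0.5, size=n)          # binary outcome

# Step 1: build the pre-validated predictor by cross-validation --
# each sample's score comes from a model that never saw that sample.
prevalidated = np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X_omics):
    inner = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    inner.fit(X_omics[train], y[train])
    prevalidated[test] = inner.decision_function(X_omics[test])

# Step 2: test the pre-validated predictor alongside the clinical
# covariates in an ordinary logistic regression.
design = sm.add_constant(np.column_stack([X_clin, prevalidated]))
fit = sm.Logit(y, design).fit(disp=0)
print(fit.summary())  # the z-statistic for the last column is the quantity
                      # whose null distribution the paper studies
```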
Abstract:One of the enduring problems surrounding neural networks is to identify the factors that differentiate them from traditional statistical models. We prove a pair of results which distinguish feedforward neural networks among parametric models at the population level, for regression tasks. First, we prove that for any pair of random variables $(X,Y)$, neural networks always learn a nontrivial relationship between $X$ and $Y$, if one exists. Second, we prove that for reasonably smooth parametric models, under local and global identifiability conditions, there exists a nontrivial $(X,Y)$ pair for which the parametric model learns the constant predictor $\mathbb{E}[Y]$. Together, our results suggest that a lack of identifiability distinguishes neural networks among the class of smooth parametric models.
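A toy finite-sample analogue of the two results, using a quadratic dependence as the illustrative example (the specific models below are assumptions, not from the paper): a linear model, which is identifiable, ends up at essentially the constant predictor $\mathbb{E}[Y]$, while a small feedforward network picks up the relationship.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, size=(2000, 1))
y = x[:, 0] ** 2                     # nontrivial dependence, but zero correlation

lin = LinearRegression().fit(x, y)   # identifiable parametric model
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                   random_state=0).fit(x, y)

print(lin.coef_, lin.intercept_)     # slope ~ 0: essentially the constant E[Y]
print(mlp.score(x, y))               # R^2 well above 0: a nontrivial fit
```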
Abstract:This paper surveys some recent developments in measures of association related to a new coefficient of correlation introduced by the author. A straightforward extension of this coefficient to standard Borel spaces (which include all Polish spaces), overlooked in the literature so far, is proposed at the end of the survey.
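The coefficient in question is Chatterjee's rank correlation $\xi_n$. A minimal sketch of the sample statistic in the no-ties case (the test cases below are illustrative):

```python
import numpy as np

def xi_correlation(x, y):
    """Chatterjee's xi coefficient, no-ties version:
    xi_n = 1 - 3 * sum_i |r_{i+1} - r_i| / (n^2 - 1),
    where r_i is the rank of y_i after sorting the pairs by x."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    order = np.argsort(x, kind="stable")      # sort the pairs by x
    r = np.argsort(np.argsort(y[order])) + 1  # ranks of y in that order
    return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n**2 - 1)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 5000)
print(xi_correlation(x, x**2))                   # functional dependence -> near 1
print(xi_correlation(x, rng.normal(size=5000)))  # independence -> near 0
```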
Abstract:We consider the problem of estimating the skeleton of a large causal polytree from a relatively small i.i.d. sample. This is motivated by the problem of determining causal structure when the number of variables is very large compared to the sample size, such as in gene regulatory networks. We give an algorithm that recovers the tree with high accuracy in such settings. The algorithm works under essentially no distributional or modeling assumptions other than some mild non-degeneracy conditions.
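One standard way to recover a tree skeleton from pairwise association scores is a Chow-Liu-style maximum-weight spanning tree; the sketch below uses absolute correlations as the score, which is illustrative and not necessarily the paper's estimator:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def tree_skeleton(X):
    """Recover a tree skeleton from an (n x p) sample via the
    maximum-weight spanning tree of pairwise |correlation| scores."""
    score = np.abs(np.corrcoef(X, rowvar=False))
    weights = 2.0 - score            # positive weights; small = strongly associated
    np.fill_diagonal(weights, 0.0)   # zeros are treated as absent edges
    mst = minimum_spanning_tree(weights)
    rows, cols = mst.nonzero()
    return sorted({tuple(sorted(e)) for e in zip(rows, cols)})

# A chain X0 -> X1 -> X2 -> X3 (a simple polytree) as a smoke test
rng = np.random.default_rng(2)
n = 2000
X = np.zeros((n, 4))
X[:, 0] = rng.normal(size=n)
for j in range(1, 4):
    X[:, j] = 0.8 * X[:, j - 1] + rng.normal(size=n)
print(tree_skeleton(X))   # expected edges: [(0, 1), (1, 2), (2, 3)]
```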
Abstract:Organizations leverage anomaly and changepoint detection algorithms to detect changes in user behavior or service availability and performance. Many off-the-shelf detection algorithms, though effective, cannot readily be used in large organizations where thousands of users monitor millions of use cases and metrics with varied time series characteristics and anomaly patterns. The selection of algorithm and parameters needs to be precise for each use case: manual tuning does not scale, and automated tuning requires ground truth, which is rarely available. In this paper, we explore MOSPAT, an end-to-end automated machine-learning-based approach for model and parameter selection, combined with a generative model to produce labeled data. Our scalable end-to-end system allows individual users in large organizations to tailor time-series monitoring to their specific use case and data characteristics, without expert knowledge of anomaly detection algorithms or laborious manual labeling. Our extensive experiments on real and synthetic data demonstrate that this method consistently outperforms the use of any single algorithm.
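A toy version of the selection loop: score each candidate detector configuration on synthetically generated, labeled series and keep the best. The spike generator, the robust z-score detector, and the F1 criterion are all assumptions for illustration, not details of MOSPAT:

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(3)

def synthetic_series(n=500, n_anomalies=5):
    """Generative stand-in: a noisy series with injected level spikes,
    returned together with its ground-truth anomaly labels."""
    y = rng.normal(size=n).cumsum() * 0.1 + rng.normal(scale=0.5, size=n)
    labels = np.zeros(n, dtype=int)
    idx = rng.choice(n, size=n_anomalies, replace=False)
    y[idx] += rng.choice([-6.0, 6.0], size=n_anomalies)
    labels[idx] = 1
    return y, labels

def zscore_detector(y, window, thresh):
    """Flag points far from a rolling median, in robust z-score units."""
    flags = np.zeros(len(y), dtype=int)
    for t in range(window, len(y)):
        w = y[t - window:t]
        mad = np.median(np.abs(w - np.median(w))) + 1e-9
        flags[t] = abs(y[t] - np.median(w)) / (1.4826 * mad) > thresh
    return flags

# Model/parameter selection: score each candidate on synthetic labeled data
candidates = [(w, th) for w in (20, 50) for th in (3.0, 5.0)]
scores = {}
for params in candidates:
    f1s = [f1_score(*reversed((zscore_detector(y, *params), labels)))
           for y, labels in (synthetic_series() for _ in range(10))]
    scores[params] = np.mean(f1s)
print(max(scores, key=scores.get), scores)
```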
Abstract:Optimization by gradient descent has been one of the main drivers of the "deep learning revolution". Yet, despite some recent progress for extremely wide networks, it remains an open problem to understand why gradient descent often converges to global minima when training deep neural networks. This article presents a new criterion for convergence of gradient descent to a global minimum, which is provably more powerful than the best available criteria from the literature, namely, the Łojasiewicz inequality and its generalizations. This criterion is used to show that gradient descent with proper initialization converges to a global minimum when training any feedforward neural network with smooth and strictly increasing activation functions, provided that the input dimension is greater than or equal to the number of data points.
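For reference, the best-known criterion of this kind is the Polyak-Łojasiewicz (PL) inequality, together with the standard linear rate it yields for gradient descent on an $L$-smooth objective; this is the classical baseline, not the paper's new criterion:

```latex
% PL condition and the standard linear-rate consequence for gradient
% descent on an L-smooth objective f with minimum value f^*.
\begin{align}
  \tfrac{1}{2}\,\|\nabla f(x)\|^2 &\ge \mu\,\bigl(f(x) - f^*\bigr)
      \quad \text{for all } x \quad \text{(PL inequality)} \\
  x_{k+1} &= x_k - \tfrac{1}{L}\,\nabla f(x_k) \\
  f(x_{k+1}) - f^* &\le \Bigl(1 - \tfrac{\mu}{L}\Bigr)\bigl(f(x_k) - f^*\bigr)
      \;\Longrightarrow\;
      f(x_k) - f^* \le \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k}\bigl(f(x_0) - f^*\bigr)
\end{align}
```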
Abstract:Many critical policy decisions, from strategic investments to the allocation of humanitarian aid, rely on data about the geographic distribution of wealth and poverty. Yet many poverty maps are out of date or exist only at very coarse levels of granularity. Here we develop the first micro-estimates of wealth and poverty that cover the populated surface of all 135 low- and middle-income countries (LMICs) at 2.4 km resolution. The estimates are built by applying machine learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, and topographic maps, as well as aggregated and de-identified connectivity data from Facebook. We train and calibrate the estimates using nationally representative household survey data from 56 LMICs, then validate their accuracy using four independent sources of household survey data from 18 countries. We also provide confidence intervals for each micro-estimate to facilitate responsible downstream use. These estimates are provided free for public use in the hope that they enable targeted policy response to the COVID-19 pandemic, provide the foundation for new insights into the causes and consequences of economic development and growth, and promote responsible policymaking in support of the Sustainable Development Goals.
Abstract:Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social networks, representing people who communicate with each other, are one example. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasible method to discover these communities. The stochastic blockmodel [Social Networks 5 (1983) 109--137] is a social network model with well-defined communities; each node is a member of one community. For a network generated from the stochastic blockmodel, we bound the number of nodes "misclustered" by spectral clustering. The asymptotic results in this paper are the first clustering results that allow the number of clusters in the model to grow with the number of nodes, hence the name high-dimensional. In order to study spectral clustering under the stochastic blockmodel, we first show that under the more general latent space model, the eigenvectors of the normalized graph Laplacian asymptotically converge to the eigenvectors of a "population" normalized graph Laplacian. Aside from the implication for spectral clustering, this provides insight into a graph visualization technique. Our method of studying the eigenvectors of random matrices is original.
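A minimal sketch of the pipeline the abstract analyzes, on a two-block stochastic blockmodel (block sizes and edge probabilities are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)

# Two-block stochastic blockmodel: within-block prob 0.10, between 0.02
n_per_block, p_in, p_out = 150, 0.10, 0.02
z = np.repeat([0, 1], n_per_block)                 # true community labels
P = np.where(z[:, None] == z[None, :], p_in, p_out)
A = (rng.uniform(size=P.shape) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                        # symmetric, no self-loops

# Normalized graph Laplacian L = D^{-1/2} A D^{-1/2}
d = A.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# k-means on the rows of the matrix of k leading eigenvectors
vals, vecs = np.linalg.eigh(L)                     # ascending eigenvalues
U = vecs[:, -2:]                                   # k = 2 leading eigenvectors
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(U)
err = min(np.mean(pred != z), np.mean(pred != 1 - z))  # labels up to permutation
print(f"misclustered fraction: {err:.3f}")
```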