Chris McKennan

Simultaneous estimation of connectivity and dimensionality in samples of networks

Aug 17, 2025

Wenlong Jiang, Chris McKennan, Jesús Arroyo, Joshua Cape

Abstract: An overarching objective in contemporary statistical network analysis is extracting salient information from datasets consisting of multiple networks. To date, considerable attention has been devoted to node and network clustering, while comparatively less attention has been devoted to downstream connectivity estimation and parsimonious embedding dimension selection. Given a sample of potentially heterogeneous networks, this paper proposes a method to simultaneously estimate a latent matrix of connectivity probabilities and its embedding dimensionality or rank after first pre-estimating the number of communities and the node community memberships. The method is formulated as a convex optimization problem and solved using an alternating direction method of multipliers algorithm. We establish estimation error bounds under the Frobenius norm and nuclear norm for settings in which observable networks have blockmodel structure, even when node memberships are imperfectly recovered. When perfect membership recovery is possible and dimensionality is much smaller than the number of communities, the proposed method outperforms conventional averaging-based methods for estimating connectivity and dimensionality. Numerical studies empirically demonstrate the accuracy of our method across various scenarios. Additionally, analysis of a primate brain dataset demonstrates that posited connectivity is not necessarily full rank in practice, illustrating the need for flexible methodology.

* Main text: 35 pages, 5 figures, 5 tables. Supplement: 26 pages, 3 figures, 5 tables


A statistical framework for GWAS of high dimensional phenotypes using summary statistics, with application to metabolite GWAS

Mar 17, 2023

Weiqiong Huang, Emily C. Hector, Joshua Cape, Chris McKennan

Abstract: The recent explosion of genetic and high dimensional biobank and 'omic' data has provided researchers with the opportunity to investigate the shared genetic origin (pleiotropy) of hundreds to thousands of related phenotypes. However, existing methods for multi-phenotype genome-wide association studies (GWAS) do not model pleiotropy, are only applicable to a small number of phenotypes, or provide no way to perform inference. To add further complication, raw genetic and phenotype data are rarely observed, meaning analyses must be performed on GWAS summary statistics whose statistical properties in high dimensions are poorly understood. We therefore developed a novel model, theoretical framework, and set of methods to perform Bayesian inference in GWAS of high dimensional phenotypes using summary statistics that explicitly model pleiotropy, beget fast computation, and facilitate the use of biologically informed priors. We demonstrate the utility of our procedure by applying it to metabolite GWAS, where we develop new nonparametric priors for genetic effects on metabolite levels that use known metabolic pathway information and foster interpretable inference at the pathway level.

* 24 pages of main text, 7 figures, 1 table
