Abstract: We statistically analyze empirical plug-in estimators for unbalanced optimal transport (UOT) formalisms between general intensity measures, focusing on the Kantorovich-Rubinstein distance, based on observations from spatio-temporal point processes. Specifically, we model the observations by two weakly time-stationary point processes with spatial intensity measures $\mu$ and $\nu$ over the expanding window $(0,t]$ as $t$ increases to infinity, and establish sharp convergence rates of the empirical UOT in terms of the intrinsic dimensions of the measures. We assume a sub-quadratic temporal growth condition on the variance of the process, which allows for a wide range of temporal dependencies; as the growth approaches quadratic, the convergence rate becomes slower. This variance assumption is related to the time-reduced factorial covariance measure, and we verify its validity for various point processes, including the Poisson cluster, Hawkes, Neyman-Scott, and log-Gaussian Cox processes. Complementary to our upper bounds, we also derive matching lower bounds for various spatio-temporal point processes of interest and establish near minimax rate optimality of the empirical Kantorovich-Rubinstein distance.
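To make the plug-in estimator concrete, here is a minimal sketch (not the paper's implementation): two homogeneous Poisson patterns are simulated, their time-averaged empirical intensity measures are formed, and the Kantorovich-Rubinstein distance is computed via the standard reduction of UOT to balanced OT with a dummy point. The penalty parameter `C`, the unit-square domain, the intensities, and the use of the POT library are all illustrative assumptions.

```python
# Minimal sketch (assumptions noted above, not the paper's method):
# empirical plug-in Kantorovich-Rubinstein (KR) distance between two
# spatial point patterns observed over the temporal window (0, t].
import numpy as np
import ot  # Python Optimal Transport (pip install pot)

rng = np.random.default_rng(0)
t = 100.0  # length of the temporal observation window (0, t]

def poisson_pattern(rate, t, rng):
    """Spatial locations in [0,1]^2 of a homogeneous Poisson process
    with the given space-time rate, observed over (0, t]."""
    n = rng.poisson(rate * t)
    return rng.uniform(size=(n, 2))

X = poisson_pattern(rate=2.0, t=t, rng=rng)  # observations of mu
Y = poisson_pattern(rate=3.0, t=t, rng=rng)  # observations of nu

# Empirical intensity measures: each point carries mass 1/t, so the total
# masses estimate mu([0,1]^2) and nu([0,1]^2) and need not be equal.
a = np.full(len(X), 1.0 / t)
b = np.full(len(Y), 1.0 / t)

# KR distance with penalty C reduces to balanced OT after appending one
# dummy point per side that absorbs excess mass at cost C/2.
C = 1.0
M = ot.dist(X, Y, metric="euclidean")
M_ext = np.block([
    [M, np.full((len(X), 1), C / 2)],
    [np.full((1, len(Y)), C / 2), np.zeros((1, 1))],
])
a_ext = np.append(a, b.sum())  # both extended vectors now have equal mass
b_ext = np.append(b, a.sum())

kr = ot.emd2(a_ext, b_ext, M_ext)  # empirical KR distance
print(f"empirical KR distance: {kr:.4f}")
```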
Abstract: We analyze the statistical problem of recovering an atomic signal, modeled as a discrete uniform distribution $\mu$, from a binned Poisson convolution model. This question is motivated, among other applications, by super-resolution laser microscopy, where precise estimation of $\mu$ provides insights into spatial formations of cellular protein assemblies. Our main results quantify the local minimax risk of estimating $\mu$ for a broad class of smooth convolution kernels. This local perspective enables us to sharply quantify optimal estimation rates as a function of the clustering structure of the underlying signal. Moreover, our results are expressed under a multiscale loss function, which reveals that different parts of the underlying signal can be recovered at different rates depending on their local geometry. Overall, these results paint an optimistic picture of the Poisson deconvolution problem, showing that accurate recovery is achievable under a much broader class of signals than suggested by existing global minimax analyses. Beyond Poisson deconvolution, our results also allow us to establish the local minimax rate of parameter estimation in Gaussian mixture models with uniform weights. We apply our methods to experimental super-resolution microscopy data to identify the location and configuration of individual DNA origamis. In addition, we complement our findings with numerical experiments on runtime and statistical recovery that showcase the practical performance of our estimators and their trade-offs.
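The sketch below illustrates one way the binned Poisson convolution model can be simulated and fit. It assumes a Gaussian kernel with known bandwidth, a known number of atoms, and plain likelihood optimization; it is a toy stand-in, not the paper's multiscale estimator, and all parameter values are invented for illustration.

```python
# Minimal sketch (illustrative assumptions noted above): binned Poisson
# deconvolution of a discrete uniform signal with a Gaussian kernel.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
k, sigma, scale = 3, 0.05, 2000.0          # atoms, kernel bandwidth, photon budget
theta_true = np.array([0.30, 0.38, 0.70])  # atomic signal (uniform weights 1/k)
edges = np.linspace(0.0, 1.0, 101)         # 100 bins on [0, 1]

def bin_means(theta):
    """Expected counts per bin: scale * (mu * N(0, sigma^2))(bin)."""
    cdf = norm.cdf(edges[None, :], loc=theta[:, None], scale=sigma)
    return scale * np.mean(cdf[:, 1:] - cdf[:, :-1], axis=0)

counts = rng.poisson(bin_means(theta_true))  # observed binned data

def neg_log_lik(theta):
    lam = np.maximum(bin_means(theta), 1e-12)
    return np.sum(lam - counts * np.log(lam))  # Poisson NLL up to a constant

# Crude multi-start local optimization over the atom locations.
best = min(
    (minimize(neg_log_lik, rng.uniform(size=k), method="Nelder-Mead")
     for _ in range(20)),
    key=lambda res: res.fun,
)
print("estimated atoms:", np.sort(best.x))
```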
Abstract: Entropic optimal transport (EOT) presents an effective and computationally viable alternative to unregularized optimal transport (OT), with diverse applications in large-scale data analysis. In this work, we derive novel statistical bounds for empirical plug-in estimators of the EOT cost and show that their statistical performance, in terms of the entropy regularization parameter $\epsilon$ and the sample size $n$, depends only on the simpler of the two probability measures. For instance, under sufficiently smooth costs this yields the parametric rate $n^{-1/2}$ with factor $\epsilon^{-d/2}$, where $d$ is the minimum dimension of the two population measures. This confirms that empirical EOT also adheres to the lower complexity adaptation principle, a hallmark feature only recently identified for unregularized OT. As a consequence of our theory, we show that the empirical entropic Gromov-Wasserstein distance and its unregularized version for measures on Euclidean spaces also obey this principle. Additionally, we comment on computational aspects and complement our findings with Monte Carlo simulations. Our techniques employ empirical process theory and rely on a dual formulation of EOT over a single function class. Crucial to our analysis is the observation that the entropic cost-transformation of a function class does not increase its uniform metric entropy by much.
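To make the empirical EOT cost concrete, here is a minimal log-domain Sinkhorn sketch for two samples under a squared Euclidean cost. It is a from-scratch illustration, not the authors' code; the toy example with a one-dimensional source embedded in $\mathbb{R}^{10}$ merely gestures at the lower complexity adaptation setting, where the rate is governed by the simpler measure.

```python
# Minimal sketch (assumptions: squared Euclidean cost, fixed iteration
# count, uniform empirical measures): empirical EOT cost via Sinkhorn.
import numpy as np
from scipy.special import logsumexp

def empirical_eot_cost(X, Y, eps, n_iter=500):
    """Transport cost under the entropic plan between uniform empirical
    measures on the samples X and Y, regularization parameter eps."""
    n, m = len(X), len(Y)
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # cost matrix
    log_a, log_b = -np.log(n) * np.ones(n), -np.log(m) * np.ones(m)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):  # alternating dual (log-domain) updates
        f = -eps * logsumexp((g[None, :] - C) / eps + log_b[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + log_a[:, None], axis=0)
    # Entropic plan: pi_ij = a_i b_j exp((f_i + g_j - C_ij) / eps).
    log_pi = (f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :]
    return np.sum(np.exp(log_pi) * C)

# Lower complexity adaptation, informally: couple a 1-d measure embedded
# in R^10 with a genuinely 10-d measure; the simpler side drives the rate.
rng = np.random.default_rng(2)
X = rng.uniform(size=(300, 1)) @ np.ones((1, 10))
Y = rng.uniform(size=(300, 10))
print(f"empirical EOT cost: {empirical_eot_cost(X, Y, eps=0.5):.4f}")
```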