Abstract: The optimal transport problem with squared Euclidean cost consists in finding a coupling between two input measures that maximizes correlation. Consequently, the optimal coupling is often singular with respect to Lebesgue measure. Regularizing the optimal transport problem with an entropy term yields an approximation called entropic optimal transport. Entropic penalties steer the induced coupling toward a reference measure with desired properties. For instance, when seeking a diffuse coupling, the most popular reference measures are the Lebesgue measure and the product of the two input measures. In this work, we study the case where the reference coupling is not assumed to be a product. We focus on the Gaussian case as a motivating paradigm, and provide a reduction of this more general optimal transport criterion to a matrix optimization problem. This reduction enables us to provide a complete description of the solution, in terms of both the primal variable and the dual variables. We argue that flexibility in the choice of reference measure can be important in statistical contexts, for instance when one has prior information, when there is uncertainty regarding the measures to be coupled, or to reduce bias when the entropic problem is used to estimate the unregularized transport problem. In particular, we show in numerical examples that choosing a suitable reference plan reduces the bias caused by the entropic penalty.
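To make the role of the reference measure concrete, the following is a minimal discrete sketch, not the Gaussian closed-form solution described in the abstract. It assumes the penalized problem takes the form min over couplings P of <C, P> + eps * KL(P | R), for which the standard Sinkhorn scaling applies with kernel K = R * exp(-C / eps). The function name `sinkhorn_with_reference`, the grid, the value of `eps`, and the two reference plans are all illustrative assumptions.

```python
import numpy as np

def sinkhorn_with_reference(a, b, C, R, eps=1.0, n_iter=2000):
    """Approximately minimize <C, P> + eps * KL(P | R) over couplings P of (a, b).

    Assumption: R is a strictly positive reference plan on the same grid.
    The minimizer has the form diag(u) @ K @ diag(v) with K = R * exp(-C / eps).
    """
    K = R * np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)      # rescale rows toward marginal a
        v = b / (K.T @ u)    # rescale columns to match marginal b
    return u[:, None] * K * v[None, :]

# Two discretized Gaussian-like marginals on a common 1D grid (toy data).
x = np.linspace(-2.0, 2.0, 30)
a = np.exp(-0.5 * x**2)
a /= a.sum()
b = np.exp(-0.5 * (x - 0.5) ** 2)
b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2   # squared Euclidean cost

# Reference 1: product of the marginals (the usual choice).
R_product = np.outer(a, b)
# Reference 2: a correlated, non-product reference concentrated near the diagonal.
R_corr = np.exp(-0.5 * C)
R_corr /= R_corr.sum()

P_prod = sinkhorn_with_reference(a, b, C, R_product)
P_corr = sinkhorn_with_reference(a, b, C, R_corr)

print("transport cost, product reference:   ", (C * P_prod).sum())
print("transport cost, correlated reference:", (C * P_corr).sum())
```

In this toy setup the correlated reference is itself proportional to exp(-C / 2), so it acts like a smaller effective regularization parameter and the resulting plan has a lower transport cost than the one obtained with the product reference. This only illustrates the mechanism; it does not reproduce the paper's numerical experiments.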
Abstract: We introduce a novel statistical framework for the analysis of replicated point processes that allows for the study of point pattern variability at a population level. By treating point process realizations as random measures, we adopt a functional analysis perspective and propose a form of functional Principal Component Analysis (fPCA) for point processes. The originality of our method is to base the analysis on the cumulative mass functions of the random measures, which yields a direct and interpretable analysis. Key theoretical contributions include establishing a Karhunen-Loève expansion for the random measures and a Mercer theorem for covariance measures. We establish convergence in a strong sense, and introduce the concept of principal measures, which can be seen as latent processes governing the dynamics of the observed point patterns. We propose an easy-to-implement estimation strategy for the eigenelements, for which parametric rates are achieved. We fully characterize the solutions of our approach for Poisson and Hawkes processes and validate our methodology via simulations and diverse applications in seismology, single-cell biology and neuroscience, demonstrating its versatility and effectiveness. Our method is implemented in the pppca R package.
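The idea of running a PCA on cumulative mass functions can be sketched as follows. This is a rough illustration under stated assumptions, not the estimator implemented in the pppca package: it assumes point patterns observed on [0, 1], evaluates each cumulative mass function on a uniform grid, and diagonalizes the discretized empirical covariance. The function names `cumulative_mass_functions` and `fpca_on_cmf` and the simulated Poisson data are hypothetical.

```python
import numpy as np

def cumulative_mass_functions(patterns, grid):
    """Evaluate F_i(t) = number of points of pattern i in [0, t] on the grid."""
    return np.array(
        [np.searchsorted(np.sort(p), grid, side="right") for p in patterns],
        dtype=float,
    )

def fpca_on_cmf(patterns, grid, n_components=2):
    """Grid-based PCA of cumulative mass functions (illustrative sketch only)."""
    F = cumulative_mass_functions(patterns, grid)
    mean = F.mean(axis=0)
    Fc = F - mean                       # centered cumulative mass functions
    h = grid[1] - grid[0]               # uniform grid spacing (assumption)
    cov = Fc.T @ Fc / len(patterns)     # empirical covariance on the grid
    eigvals, eigvecs = np.linalg.eigh(cov * h)   # discretized covariance operator
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    eigfuns = eigvecs[:, order].T / np.sqrt(h)   # normalized so sum(phi**2) * h == 1
    scores = Fc @ eigfuns.T * h         # projections of each pattern on each eigenfunction
    return mean, eigvals, eigfuns, scores

# Toy data: 200 replicated homogeneous Poisson patterns on [0, 1].
rng = np.random.default_rng(0)
patterns = [rng.uniform(0.0, 1.0, rng.poisson(50)) for _ in range(200)]
grid = np.linspace(0.0, 1.0, 101)

mean, eigvals, eigfuns, scores = fpca_on_cmf(patterns, grid, n_components=2)
print("leading eigenvalues:", eigvals)
```

Working with cumulative counts rather than the raw point locations turns each realization into an ordinary function on the grid, which is what makes a standard eigendecomposition applicable here.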