Get our free extension to see links to code for papers anywhere online!Free extension: code links for papers anywhere!Free add-on: See code for papers anywhere!

Sinho Chewi, Jaume de Dios Pont, Jerry Li, Chen Lu, Shyam Narayanan

Log-concave sampling has witnessed remarkable algorithmic advances in recent years, but the corresponding problem of proving lower bounds for this task has remained elusive, with lower bounds previously known only in dimension one. In this work, we establish the following query lower bounds: (1) sampling from strongly log-concave and log-smooth distributions in dimension $d\ge 2$ requires $\Omega(\log \kappa)$ queries, which is sharp in any constant dimension, and (2) sampling from Gaussians in dimension $d$ (hence also from general log-concave and log-smooth distributions in dimension $d$) requires $\widetilde \Omega(\min(\sqrt\kappa \log d, d))$ queries, which is nearly sharp for the class of Gaussians. Here $\kappa$ denotes the condition number of the target distribution. Our proofs rely upon (1) a multiscale construction inspired by work on the Kakeya conjecture in harmonic analysis, and (2) a novel reduction that demonstrates that block Krylov algorithms are optimal for this problem, as well as connections to lower bound techniques based on Wishart matrices developed in the matrix-vector query literature.

Via

Sinho Chewi, Patrik Gerber, Holden Lee, Chen Lu

We prove two lower bounds for the complexity of non-log-concave sampling within the framework of Balasubramanian et al. (2022), who introduced the use of Fisher information (FI) bounds as a notion of approximate first-order stationarity in sampling. Our first lower bound shows that averaged LMC is optimal for the regime of large FI by reducing the problem of finding stationary points in non-convex optimization to sampling. Our second lower bound shows that in the regime of small FI, obtaining a FI of at most $\varepsilon^2$ from the target distribution requires $\text{poly}(1/\varepsilon)$ queries, which is surprising as it rules out the existence of high-accuracy algorithms (e.g., algorithms using Metropolis-Hastings filters) in this context.

Via

Sinho Chewi, Patrik Gerber, Chen Lu, Thibaut Le Gouic, Philippe Rigollet

We establish the first tight lower bound of $\Omega(\log\log\kappa)$ on the query complexity of sampling from the class of strongly log-concave and log-smooth distributions with condition number $\kappa$ in one dimension. Whereas existing guarantees for MCMC-based algorithms scale polynomially in $\kappa$, we introduce a novel algorithm based on rejection sampling that closes this doubly exponential gap.

Via

Sinho Chewi, Patrik Gerber, Chen Lu, Thibaut Le Gouic, Philippe Rigollet

We consider the task of generating exact samples from a target distribution, known up to normalization, over a finite alphabet. The classical algorithm for this task is rejection sampling, and although it has been used in practice for decades, there is surprisingly little study of its fundamental limitations. In this work, we study the query complexity of rejection sampling in a minimax framework for various classes of discrete distributions. Our results provide new algorithms for sampling whose complexity scales sublinearly with the alphabet size. When applied to adversarial bandits, we show that a slight modification of the Exp3 algorithm reduces the per-iteration complexity from $\mathcal O(K)$ to $\mathcal O(\log^2 K)$, where $K$ is the number of arms.

Via

Guanqun Cao, Jiaqi Jiang, Chen Lu, Daniel Fernandes Gomes, Shan Luo

Tactile sensing is important for robots to perceive the world as it captures the texture and hardness of the object in contact and is robust to illumination and colour variances. However, due to the limited sensing area and the resistance of the fixed surface, current tactile sensors have to tap the tactile sensor on target object many times when assessing a large surface, i.e., pressing, lifting up and shifting to another region. This process is ineffective and time consuming. It is also undesirable to drag such sensors as this often damages the sensitive membrane of the sensor or the object. To address these problems, we propose a cylindrical optical tactile sensor named TouchRoller that can roll around its center axis. It maintains being in contact with the assessed surface throughout the entire motion, which allows for measuring the object continuously and effectively. Extensive experiments show that the TouchRoller sensor can cover a textured surface of 8cm*11cm in a short time of 10s, much more effectively than a flat optical tactile sensor (in 196s). The reconstructed map of the texture from the collected tactile images has a high Structural Similarity Index (SSIM) of 0.31 on average, when compared with the visual texture. In addition, the contacts on the sensor can be localised with a low localisation error, 2.63mm in the center regions and 7.66mm on average. The proposed sensor will enable the fast assessment of large surfaces with high-resolution tactile sensing, and also the effective collection of tactile images.

Via

Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet

Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as $O(d^{1/3})$, where $d$ is the dimension. However, the diffusion scaling limit requires stringent assumptions on the target distribution and is asymptotic in nature. In contrast, the best known non-asymptotic mixing time bound for MALA on the class of log-smooth and strongly log-concave distributions is $O(d)$. In this work, we establish that the mixing time of MALA on this class of target distributions is $\widetilde\Theta(d^{1/2})$ under a warm start. Our upper bound proof introduces a new technique based on a projection characterization of the Metropolis adjustment which reduces the study of MALA to the well-studied discretization analysis of the Langevin SDE and bypasses direct computation of the acceptance probability.

Via

Chen Lu, Subhabrata Sen

We study community detection in the contextual stochastic block model arXiv:1807.09596 [cs.SI], arXiv:1607.02675 [stat.ME]. In arXiv:1807.09596 [cs.SI], the second author studied this problem in the setting of sparse graphs with high-dimensional node-covariates. Using the non-rigorous cavity method from statistical physics, they conjectured the sharp limits for community detection in this setting. Further, the information theoretic threshold was verified, assuming that the average degree of the observed graph is large. It is expected that the conjecture holds as soon as the average degree exceeds one, so that the graph has a giant component. We establish this conjecture, and characterize the sharp threshold for detection and weak recovery.

Via

Sinho Chewi, Thibaut Le Gouic, Chen Lu, Tyler Maunu, Philippe Rigollet

Stein Variational Gradient Descent (SVGD), a popular sampling algorithm, is often described as the kernelized gradient flow for the Kullback-Leibler divergence in the geometry of optimal transport. We introduce a new perspective on SVGD that instead views SVGD as the (kernelized) gradient flow of the chi-squared divergence which, we show, exhibits a strong form of uniform exponential ergodicity under conditions as weak as a Poincar\'e inequality. This perspective leads us to propose an alternative to SVGD, called Laplacian Adjusted Wasserstein Gradient Descent (LAWGD), that can be implemented from the spectral decomposition of the Laplacian operator associated with the target density. We show that LAWGD exhibits strong convergence guarantees and good practical performance.

Via

Sinho Chewi, Thibaut Le Gouic, Chen Lu, Tyler Maunu, Philippe Rigollet, Austin J. Stromme

Motivated by the problem of sampling from ill-conditioned log-concave distributions, we give a clean non-asymptotic convergence analysis of mirror-Langevin diffusions as introduced in Zhang et al. (2020). As a special case of this framework, we propose a class of diffusions called Newton-Langevin diffusions and prove that they converge to stationarity exponentially fast with a rate which not only is dimension-free, but also has no dependence on the target distribution. We give an application of this result to the problem of sampling from the uniform distribution on a convex body using a strategy inspired by interior-point methods. Our general approach follows the recent trend of linking sampling and optimization and highlights the role of the chi-squared divergence. In particular, it yields new results on the convergence of the vanilla Langevin diffusion in Wasserstein distance.

Via