Abstract: A celebrated result of Pollard proves asymptotic consistency for $k$-means clustering when the population distribution has finite variance. In this work, we point out that the population-level $k$-means clustering problem is, in fact, well-posed under the weaker assumption of a finite expectation, and we investigate whether some form of asymptotic consistency holds in this setting. As we illustrate in a variety of negative results, the complete story is quite subtle; for example, the empirical $k$-means cluster centers may fail to converge even if there exists a unique set of population $k$-means cluster centers. A detailed analysis of our negative results reveals that inconsistency arises because of an extreme form of cluster imbalance, whereby the presence of outlying samples leads to some empirical $k$-means clusters possessing very few points. We then give a collection of positive results which show that some forms of asymptotic consistency, under only the assumption of finite expectation, may be recovered by imposing some a priori degree of balance among the empirical $k$-means clusters.
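The cluster imbalance described above can be seen in a toy example. The following is a minimal sketch, not the construction used in the paper: it assumes one-dimensional data and $k = 2$, and it uses the standard fact that optimal one-dimensional $k$-means clusters are contiguous in sorted order, so an exhaustive search over split points is exact. The function name `two_means_1d` and the sample values are hypothetical and chosen only for illustration.

```python
def two_means_1d(points):
    """Exact 2-means for 1D data by trying every split of the sorted sample.

    Returns (cost, (left_center, right_center), (left_size, right_size)).
    Illustrative helper only; not taken from the paper.
    """
    xs = sorted(points)
    best = None
    for split in range(1, len(xs)):
        left, right = xs[:split], xs[split:]
        c_left = sum(left) / len(left)
        c_right = sum(right) / len(right)
        # Within-cluster sum of squared distances to the cluster means.
        cost = sum((x - c_left) ** 2 for x in left) + sum(
            (x - c_right) ** 2 for x in right
        )
        if best is None or cost < best[0]:
            best = (cost, (c_left, c_right), (len(left), len(right)))
    return best


# Five well-behaved points and one extreme outlier (hypothetical data).
sample = [0.0, 1.0, 2.0, 3.0, 4.0, 1000.0]
cost, centers, sizes = two_means_1d(sample)
print(sizes)    # cluster sizes of the optimal 2-means solution
print(centers)  # corresponding cluster centers
```

In this sample the single outlier receives a cluster of its own, so one empirical center is determined by one point. Under heavy tails such outliers can recur at every sample size, which gives some intuition for why the empirical centers may fail to stabilize.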
Abstract: We introduce a class of clustering procedures which includes $k$-means and $k$-medians, as well as variants of these where the domain of the cluster centers can be chosen adaptively (for example, $k$-medoids) and where the number of cluster centers can be chosen adaptively (for example, according to the elbow method). In the non-parametric setting and assuming only the finiteness of certain moments, we show that all clustering procedures in this class are strongly consistent under IID samples. Our method of proof is to directly study the continuity of various deterministic maps associated with these clustering procedures, and to show that strong consistency simply descends from analogous strong consistency of the empirical measures. In the adaptive setting, our work provides a strong consistency result that is the first of its kind. In the non-adaptive setting, our work strengthens Pollard's classical result by dispensing with various unnecessary technical hypotheses, by upgrading the particular notion of strong consistency, and by using the same methods to prove further limit theorems.