We study the computational complexity of several explainable clustering problems in the framework proposed by [Dasgupta et al., ICML 2020], where explainability is achieved via axis-aligned decision trees. We consider the $k$-means, $k$-medians, $k$-centers, and spacing cost functions. We prove that the first three are hard to optimize, while the last can be optimized in polynomial time.
The Gini impurity is one of the measures used to select attributes in the construction of Decision Trees and Random Forests. In this note we discuss connections between the problem of computing the partition with minimum Weighted Gini impurity and the $k$-means clustering problem. Based on these connections, we show that computing the partition with minimum Weighted Gini impurity is an NP-complete problem, and we also discuss how to obtain new algorithms with provable approximation guarantees for the Gini Minimization problem.
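
As a rough illustration of the kind of connection mentioned above, the sketch below checks numerically one standard identity: if each sample is encoded as a one-hot vector of its class, then the sum of squared distances from the samples in a part to that part's centroid equals the part's size times its Gini impurity. The sketch assumes that the Weighted Gini impurity of a partition is the sum, over its parts, of part size times Gini impurity; that definition, the function names, and the toy data are illustrative assumptions and not necessarily the construction used in the note.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_j p_j^2 of a non-empty list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini(partition):
    """Assumed definition: sum over parts of |part| * gini(part)."""
    return sum(len(part) * gini(part) for part in partition if part)


def one_hot(label, classes):
    """Encode a label as a one-hot vector over the given class list."""
    return [1.0 if label == c else 0.0 for c in classes]


def kmeans_cost(partition, classes):
    """Sum of squared Euclidean distances of one-hot points to their part centroid."""
    total = 0.0
    for part in partition:
        if not part:
            continue
        points = [one_hot(label, classes) for label in part]
        # Centroid coordinates are the class frequencies inside the part.
        centroid = [sum(col) / len(points) for col in zip(*points)]
        total += sum(
            sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in points
        )
    return total


# Hypothetical toy data: a partition of 7 labeled samples into 2 parts.
classes = ["a", "b", "c"]
partition = [["a", "a", "b"], ["b", "c", "c", "c"]]

wg = weighted_gini(partition)
km = kmeans_cost(partition, classes)
print(wg, km)  # the two values agree up to floating-point error
```

Under these assumptions, minimizing the Weighted Gini impurity over partitions of one-hot encoded samples is the same objective as minimizing the $k$-means cost over the same partitions, which is why hardness and approximation results can plausibly be carried between the two problems.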