Abstract: Decision trees are prized for their interpretability and strong performance on tabular data. Yet their reliance on simple axis-aligned linear splits often forces deep, complex structures to capture non-linear feature effects, undermining human comprehension of the constructed tree. To address this limitation, we propose a novel generalization of the decision tree, the Shape Generalized Tree (SGT), in which each internal node applies a learnable axis-aligned shape function to a single feature, enabling rich, non-linear partitioning in a single split. Because users can easily visualize each node's shape function, SGTs are inherently interpretable and provide intuitive, visual explanations of the model's decision mechanism. To learn SGTs from data, we propose ShapeCART, an efficient induction algorithm. We further extend the SGT framework to bivariate shape functions (S$^2$GT) and multi-way trees (SGT$_K$), and present Shape$^2$CART and ShapeCART$_K$, extensions of ShapeCART for learning S$^2$GTs and SGT$_K$s, respectively. Experiments on various datasets show that SGTs achieve superior performance with reduced model size compared to traditional axis-aligned linear trees.
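To make the splitting mechanism concrete, here is a minimal sketch of an SGT-style internal node. Everything below is illustrative: the class name `ShapeSplitNode`, the piecewise-linear parameterization over a fixed knot grid, and the sign-based routing rule are assumptions for exposition, not the paper's ShapeCART procedure. It illustrates how one learned shape function over a single feature can send several disjoint intervals of that feature to the same child, something an axis-aligned threshold would need multiple splits to express.

```python
# A minimal sketch (NOT the authors' ShapeCART implementation) of an SGT
# internal node. Assumption: the shape function is piecewise-linear over a
# fixed grid of knots; the paper may parameterize shapes differently. The
# node routes a sample left when the shape function is positive, so a single
# split can carve the feature axis into several disjoint intervals.

import numpy as np

class ShapeSplitNode:
    def __init__(self, feature_idx, knots, values):
        self.feature_idx = feature_idx       # which feature this node reads
        self.knots = np.asarray(knots)       # increasing grid over the feature
        self.values = np.asarray(values)     # learnable heights at each knot

    def shape(self, x):
        """Evaluate the piecewise-linear shape function at feature values x."""
        return np.interp(x, self.knots, self.values)

    def route_left(self, X):
        """Boolean mask: samples whose shape value is positive go left."""
        return self.shape(X[:, self.feature_idx]) > 0.0

# Usage: this node sends roughly [0, 0.7) and (3.3, 4] left and the middle
# of the range right -- a partition requiring several axis-aligned splits.
node = ShapeSplitNode(feature_idx=0,
                      knots=[0.0, 1.0, 2.0, 3.0, 4.0],
                      values=[1.0, -0.5, -1.0, 0.5, 1.0])
X = np.array([[0.2], [1.5], [3.8]])
print(node.route_left(X))  # -> [ True False  True]
```

Plotting `node.shape` over the feature's range is exactly the kind of per-node visualization the abstract describes as the source of interpretability.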
Abstract: Interpretable clustering algorithms aim to group similar data points while explaining the obtained groups to support knowledge discovery and pattern recognition tasks. While most approaches to interpretable clustering construct clusters using decision trees, the interpretability of trees often deteriorates on complex problems where large trees are required. In this work, we introduce the Neural Clustering Additive Model (NeurCAM), a novel approach to interpretable clustering that leverages neural generalized additive models to provide fuzzy cluster memberships with additive explanations of the obtained clusters. To promote sparsity in our model's explanations, we introduce selection gates that explicitly limit the number of features and pairwise interactions used. We also demonstrate our model's capacity to perform text clustering that considers the contextual representations of the texts while explaining the obtained clusters through uni- or bi-word terms. Extensive experiments show that NeurCAM achieves performance comparable to black-box methods on tabular datasets while remaining interpretable, and that it significantly outperforms other interpretable clustering approaches when clustering text data.
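As a rough illustration of the additive architecture, the sketch below combines per-feature subnetworks into per-cluster logits and gates each feature's contribution. All names (`NeurCAMSketch`, `feature_nets`, `gate_logits`) and the gating mechanism are assumptions for exposition: plain sigmoid gates stand in for the paper's selection gates, which explicitly cap the number of features and pairwise interactions used.

```python
# A minimal sketch (assumptions throughout, NOT the released NeurCAM code):
# a neural additive model in which each feature has its own small MLP whose
# outputs are summed into per-cluster logits; a softmax then yields fuzzy
# cluster memberships. A sigmoid gate per feature scales its contribution,
# so driving gates toward 0 sparsifies the model's explanation.

import torch
import torch.nn as nn

class NeurCAMSketch(nn.Module):
    def __init__(self, n_features, n_clusters, hidden=16):
        super().__init__()
        # One tiny MLP per feature: maps a scalar to per-cluster contributions.
        self.feature_nets = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                          nn.Linear(hidden, n_clusters))
            for _ in range(n_features)
        ])
        # Learnable gate logits; sigmoid(gate) in (0, 1) scales each feature.
        self.gate_logits = nn.Parameter(torch.zeros(n_features))

    def forward(self, x):  # x: (batch, n_features)
        gates = torch.sigmoid(self.gate_logits)            # (n_features,)
        contribs = torch.stack([net(x[:, j:j + 1])         # per-feature term
                                for j, net in enumerate(self.feature_nets)],
                               dim=1)                      # (batch, n_feat, K)
        logits = (gates.view(1, -1, 1) * contribs).sum(dim=1)  # additive sum
        return torch.softmax(logits, dim=-1)               # fuzzy memberships

# Usage: fuzzy memberships for a batch; each row sums to 1 across clusters.
model = NeurCAMSketch(n_features=5, n_clusters=3)
x = torch.randn(8, 5)
print(model(x).sum(dim=-1))  # ~1.0 for each sample
```

Because each feature's effect on the cluster logits is a standalone univariate function, the gated per-feature contributions can be plotted directly, which is what makes the additive decomposition interpretable.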