Abstract: We present a smooth probabilistic reformulation of $\ell_0$-regularized regression that does not require Monte Carlo sampling and allows for the computation of exact gradients, facilitating rapid convergence to local optima of the best subset selection problem. The method drastically improves convergence speed compared to similar Monte Carlo-based approaches. Furthermore, we empirically demonstrate that it outperforms compressive sensing algorithms such as IHT and (Relaxed) Lasso across a wide range of settings and signal-to-noise ratios. The implementation runs efficiently on both CPUs and GPUs and is freely available at https://github.com/L0-and-behold/probabilistic-nonlinear-cs. We also contribute to research on nonlinear generalizations of compressive sensing by investigating when parameter recovery of a nonlinear teacher network is possible through compression of a student network. Building upon theorems of Fefferman and Markel, we show theoretically that the global optimum in the infinite-data limit enforces recovery up to certain symmetries. For empirical validation, we implement a normal-form algorithm that selects a canonical representative within each symmetry class. However, while compression can improve test loss, we find that exact parameter recovery is not achieved even up to symmetries. In particular, we observe a surprising rebound effect in which teacher and student configurations initially converge but subsequently diverge despite a continued decrease in test loss. These findings indicate fundamental differences between linear and nonlinear compressive sensing.
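To make the idea of a sampling-free probabilistic reformulation concrete, here is a minimal sketch for the linear-regression case, assuming independent Bernoulli gates $z_i \sim \mathrm{Bern}(p_i)$ on the coefficients; in that setting the expected loss has a closed form, so autograd yields exact gradients without Monte Carlo sampling. Function names, hyperparameters, and the toy data are ours for illustration and do not reflect the repository's API.

```python
# Hypothetical sketch (ours, not the paper's code): closed-form expected loss for
# l0-regularized *linear* regression with independent Bernoulli gates z_i ~ Bern(p_i).
# Because the expectation over the gates is available in closed form, no Monte Carlo
# sampling is needed and autograd gives exact gradients of the smooth objective.
import torch

def expected_l0_objective(X, y, w, logits, lam):
    """E_z[ ||y - X (z * w)||^2 ] / n + lam * E[ ||z||_0 ], with z_i ~ Bern(sigmoid(logits_i))."""
    p = torch.sigmoid(logits)                    # inclusion probabilities
    a = X * w                                    # per-feature contributions, shape (n, d)
    mean_pred = a @ p                            # E[ x^T (z * w) ]
    var_pred = (a ** 2) @ (p * (1 - p))          # Var[ x^T (z * w) ] under independent gates
    mse = ((y - mean_pred) ** 2 + var_pred).mean()
    return mse + lam * p.sum()

# toy data: sparse ground truth with 3 active coefficients out of 50
torch.manual_seed(0)
n, d = 200, 50
X = torch.randn(n, d)
w_true = torch.zeros(d)
w_true[:3] = torch.tensor([2.0, -1.5, 1.0])
y = X @ w_true + 0.05 * torch.randn(n)

w = torch.zeros(d, requires_grad=True)
logits = torch.zeros(d, requires_grad=True)      # start with p_i = 0.5
opt = torch.optim.Adam([w, logits], lr=0.05)

for step in range(2000):
    opt.zero_grad()
    loss = expected_l0_objective(X, y, w, logits, lam=0.02)
    loss.backward()                              # exact gradient, no sampling
    opt.step()

selected = (torch.sigmoid(logits) > 0.5).nonzero().flatten().tolist()
print("selected support:", selected)             # in this toy run, {0, 1, 2} should typically be recovered
```

The closed form follows from $\mathbb{E}[(y - \sum_i x_i z_i w_i)^2] = (y - \sum_i p_i x_i w_i)^2 + \sum_i p_i (1 - p_i) x_i^2 w_i^2$ for independent gates, which is why the objective stays smooth in both $w$ and $p$.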
Abstract: We compare, improve, and contribute methods that substantially decrease the number of parameters of neural networks while maintaining high test accuracy. When applying our methods to minimize description length, we obtain highly effective data compression algorithms. In particular, we develop a probabilistic reformulation of $\ell_0$-regularized optimization for nonlinear models that does not require Monte Carlo sampling and thus improves upon previous methods. We also improve upon methods involving smooth approximations to the $\ell_0$ norm, and we investigate layerwise methods. We compare the methods across different architectures and datasets, including convolutional networks trained on image datasets and transformers trained on parts of Wikipedia. We also create a synthetic teacher-student setup to investigate compression in a controlled continuous setting. Finally, we conceptually relate compression algorithms to Solomonoff's theory of inductive inference and empirically verify the prediction that regularized models can exhibit more sample-efficient convergence.
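As an illustration of one ingredient mentioned above, the following sketch trains a small MLP with a smooth surrogate of the $\ell_0$ norm, $\sum_i (1 - e^{-w_i^2/\beta})$, and then prunes near-zero weights. This is our own assumption of one possible surrogate and threshold, not the paper's exact method or hyperparameters.

```python
# Illustrative sketch (our assumption, not the paper's exact method): regularize a
# small MLP with the smooth l0 surrogate sum_i (1 - exp(-w_i^2 / beta)), which tends
# to the l0 norm as beta -> 0, then zero out weights that end up below a threshold.
import torch
import torch.nn as nn

def smooth_l0(model, beta=1e-2):
    return sum((1.0 - torch.exp(-p ** 2 / beta)).sum() for p in model.parameters())

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()

# teacher-student style toy task: a sparse linear teacher generates the targets
X = torch.randn(512, 20)
y = (X[:, :3] @ torch.tensor([1.0, -2.0, 0.5])).unsqueeze(1)

for step in range(3000):
    opt.zero_grad()
    loss = mse(model(X), y) + 1e-4 * smooth_l0(model)
    loss.backward()
    opt.step()

# prune: zero out weights the surrogate has driven close to zero
with torch.no_grad():
    for p in model.parameters():
        p[p.abs() < 1e-2] = 0.0

nonzero = sum(int((p != 0).sum()) for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"nonzero parameters after pruning: {nonzero}/{total}")
```

The surrogate is smooth everywhere, so standard optimizers apply; the regularization weight and pruning threshold above are arbitrary illustrative values.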
Abstract: Many machine learning algorithms try to visualize high-dimensional metric data in 2D in such a way that the essential geometric and topological features of the data are highlighted. In this paper, we introduce a framework for aggregating dissimilarity functions that arise from locally adjusting a metric through density-aware normalization, as employed in the IsUMap method. We formalize these approaches as m-schemes, a class of methods closely related to t-norms and t-conorms in probabilistic metrics, as well as to composition laws in information theory. These m-schemes provide a flexible and theoretically grounded approach to refining distance-based embeddings.
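A toy illustration of the kind of aggregation involved (our simplification, not the paper's m-scheme formalism): each point induces a directed, density-normalized similarity, and the two directions are merged with a t-conorm; different t-conorms give different symmetrized dissimilarities. The bandwidth choice below is a crude stand-in for a proper density-aware normalization.

```python
# Toy illustration (our simplification, not the paper's m-schemes): directed
# similarities s_ij = exp(-(d_ij - rho_i) / sigma_i) after a rho/sigma-style local
# normalization, then symmetrization with several t-conorms.
import numpy as np

def directed_similarities(D, k=3):
    """Locally normalized similarities in the spirit of UMAP's rho/sigma rescaling."""
    n = D.shape[0]
    S = np.zeros_like(D)
    for i in range(n):
        d = np.delete(D[i], i)
        rho = np.min(d)                                       # distance to nearest neighbor
        sigma = np.partition(d, k - 1)[k - 1] - rho + 1e-12   # crude local bandwidth
        S[i] = np.exp(-np.maximum(D[i] - rho, 0.0) / sigma)
    np.fill_diagonal(S, 0.0)
    return S

# a few t-conorms for merging s_ij and s_ji into a symmetric similarity
t_conorms = {
    "maximum":           lambda a, b: np.maximum(a, b),
    "probabilistic sum": lambda a, b: a + b - a * b,   # the fuzzy union used by UMAP
    "bounded sum":       lambda a, b: np.minimum(a + b, 1.0),
}

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
S = directed_similarities(D)

for name, conorm in t_conorms.items():
    sym = conorm(S, S.T)
    dissim = 1.0 - sym                                  # turn similarity back into dissimilarity
    print(f"{name:18s} mean dissimilarity: {dissim[np.triu_indices(30, 1)].mean():.3f}")
```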
Abstract: This work introduces IsUMap, a novel manifold learning technique that enhances data representation by integrating aspects of UMAP and Isomap with Vietoris-Rips filtrations. We present a systematic and detailed construction of a metric representation for locally distorted metric spaces that captures complex data structures more accurately than previous schemes. Our approach addresses limitations in existing methods by accommodating non-uniform data distributions and intricate local geometries. We validate its performance through extensive experiments on a variety of geometric objects and benchmark real-world datasets, demonstrating significant improvements in representation quality.
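For context, here is a minimal Isomap-style sketch of one ingredient that IsUMap combines with UMAP-style local normalization and Vietoris-Rips ideas: replacing Euclidean distances by geodesic distances on a k-nearest-neighbor graph. This is our own illustration, not IsUMap itself; the neighborhood size and toy data are arbitrary.

```python
# Minimal Isomap-style sketch (our illustration, not IsUMap): build a k-nearest-
# neighbor graph and replace Euclidean distances with graph geodesics.
import numpy as np
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(X, k=8):
    n = X.shape[0]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    G = np.full((n, n), np.inf)                   # inf marks "no edge" in the dense graph
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]          # k nearest neighbors (excluding self)
        G[i, nbrs] = D[i, nbrs]
    G = np.minimum(G, G.T)                        # symmetrize the neighborhood graph
    return shortest_path(G, method="D", directed=False)

# noisy circle: geodesic distances respect the 1D manifold, Euclidean ones do not
rng = np.random.default_rng(0)
theta = np.sort(rng.uniform(0, 2 * np.pi, 200))
X = np.c_[np.cos(theta), np.sin(theta)] + 0.01 * rng.normal(size=(200, 2))
Dg = geodesic_distances(X)
print("max geodesic distance (should be close to pi ~ 3.14):", round(Dg.max(), 2))
```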