Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rayna Andreeva

Neural collapse in the orthoplex regime

Mar 21, 2026

James Alcala, Rayna Andreeva, Vladimir A. Kobzar, Dustin G. Mixon, Sanghoon Na, Shashank Sule, Yangxinyu Xie

Abstract:When training a neural network for classification, the feature vectors of the training set are known to collapse to the vertices of a regular simplex, provided the dimension $d$ of the feature space and the number $n$ of classes satisfies $n\leq d+1$. This phenomenon is known as neural collapse. For other applications like language models, one instead takes $n\gg d$. Here, the neural collapse phenomenon still occurs, but with different emergent geometric figures. We characterize these geometric figures in the orthoplex regime where $d+2\leq n\leq 2d$. The techniques in our analysis primarily involve Radon's theorem and convexity.

Via

Access Paper or Ask Questions

Approximating Metric Magnitude of Point Sets

Sep 06, 2024

Rayna Andreeva, James Ward, Primoz Skraba, Jie Gao, Rik Sarkar

Figure 1 for Approximating Metric Magnitude of Point Sets

Figure 2 for Approximating Metric Magnitude of Point Sets

Figure 3 for Approximating Metric Magnitude of Point Sets

Figure 4 for Approximating Metric Magnitude of Point Sets

Abstract:Metric magnitude is a measure of the "size" of point clouds with many desirable geometric properties. It has been adapted to various mathematical contexts and recent work suggests that it can enhance machine learning and optimization algorithms. But its usability is limited due to the computational cost when the dataset is large or when the computation must be carried out repeatedly (e.g. in model training). In this paper, we study the magnitude computation problem, and show efficient ways of approximating it. We show that it can be cast as a convex optimization problem, but not as a submodular optimization. The paper describes two new algorithms - an iterative approximation algorithm that converges fast and is accurate, and a subset selection method that makes the computation even faster. It has been previously proposed that magnitude of model sequences generated during stochastic gradient descent is correlated to generalization gap. Extension of this result using our more scalable algorithms shows that longer sequences in fact bear higher correlations. We also describe new applications of magnitude in machine learning - as an effective regularizer for neural network training, and as a novel clustering criterion.

Via

Access Paper or Ask Questions

Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms

Jul 11, 2024

Rayna Andreeva, Benjamin Dupuis, Rik Sarkar, Tolga Birdal, Umut Şimşekli

Figure 1 for Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms

Figure 2 for Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms

Figure 3 for Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms

Figure 4 for Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms

Abstract:We present a novel set of rigorous and computationally efficient topology-based complexity notions that exhibit a strong correlation with the generalization gap in modern deep neural networks (DNNs). DNNs show remarkable generalization properties, yet the source of these capabilities remains elusive, defying the established statistical learning theory. Recent studies have revealed that properties of training trajectories can be indicative of generalization. Building on this insight, state-of-the-art methods have leveraged the topology of these trajectories, particularly their fractal dimension, to quantify generalization. Most existing works compute this quantity by assuming continuous- or infinite-time training dynamics, complicating the development of practical estimators capable of accurately predicting generalization without access to test data. In this paper, we respect the discrete-time nature of training trajectories and investigate the underlying topological quantities that can be amenable to topological data analysis tools. This leads to a new family of reliable topological complexity measures that provably bound the generalization error, eliminating the need for restrictive geometric assumptions. These measures are computationally friendly, enabling us to propose simple yet effective algorithms for computing generalization indices. Moreover, our flexible framework can be extended to different domains, tasks, and architectures. Our experimental results demonstrate that our new complexity measures correlate highly with generalization error in industry-standards architectures such as transformers and deep graph networks. Our approach consistently outperforms existing topological bounds across a wide range of datasets, models, and optimizers, highlighting the practical relevance and effectiveness of our complexity measures.

Via

Access Paper or Ask Questions

Metric Space Magnitude for Evaluating Unsupervised Representation Learning

Nov 27, 2023

Katharina Limbeck, Rayna Andreeva, Rik Sarkar, Bastian Rieck

Figure 1 for Metric Space Magnitude for Evaluating Unsupervised Representation Learning

Figure 2 for Metric Space Magnitude for Evaluating Unsupervised Representation Learning

Figure 3 for Metric Space Magnitude for Evaluating Unsupervised Representation Learning

Figure 4 for Metric Space Magnitude for Evaluating Unsupervised Representation Learning

Abstract:The magnitude of a metric space was recently established as a novel invariant, providing a measure of the `effective size' of a space across multiple scales. By capturing both geometrical and topological properties of data, magnitude is poised to address challenges in unsupervised representation learning tasks. We formalise a novel notion of dissimilarity between magnitude functions of finite metric spaces and use them to derive a quality measure for dimensionality reduction tasks. Our measure is provably stable under perturbations of the data, can be efficiently calculated, and enables a rigorous multi-scale comparison of embeddings. We show the utility of our measure in an experimental suite that comprises different domains and tasks, including the comparison of data visualisations.

Via

Access Paper or Ask Questions

Accelerated Shapley Value Approximation for Data Evaluation

Nov 09, 2023

Lauren Watson, Zeno Kujawa, Rayna Andreeva, Hao-Tsung Yang, Tariq Elahi, Rik Sarkar

Figure 1 for Accelerated Shapley Value Approximation for Data Evaluation

Figure 2 for Accelerated Shapley Value Approximation for Data Evaluation

Figure 3 for Accelerated Shapley Value Approximation for Data Evaluation

Abstract:Data valuation has found various applications in machine learning, such as data filtering, efficient learning and incentives for data sharing. The most popular current approach to data valuation is the Shapley value. While popular for its various applications, Shapley value is computationally expensive even to approximate, as it requires repeated iterations of training models on different subsets of data. In this paper we show that the Shapley value of data points can be approximated more efficiently by leveraging the structural properties of machine learning problems. We derive convergence guarantees on the accuracy of the approximate Shapley value for different learning settings including Stochastic Gradient Descent with convex and non-convex loss functions. Our analysis suggests that in fact models trained on small subsets are more important in the context of data valuation. Based on this idea, we describe $\delta$-Shapley -- a strategy of only using small subsets for the approximation. Experiments show that this approach preserves approximate value and rank of data, while achieving speedup of up to 9.9x. In pre-trained networks the approach is found to bring more efficiency in terms of accurate evaluation using small subsets.

Via

Access Paper or Ask Questions

Machine learning and Topological data analysis identify unique features of human papillae in 3D scans

Jul 12, 2023

Rayna Andreeva, Anwesha Sarkar, Rik Sarkar

Abstract:The tongue surface houses a range of papillae that are integral to the mechanics and chemistry of taste and textural sensation. Although gustatory function of papillae is well investigated, the uniqueness of papillae within and across individuals remains elusive. Here, we present the first machine learning framework on 3D microscopic scans of human papillae (n = 2092), uncovering the uniqueness of geometric and topological features of papillae. The finer differences in shapes of papillae are investigated computationally based on a number of features derived from discrete differential geometry and computational topology. Interpretable machine learning techniques show that persistent homology features of the papillae shape are the most effective in predicting the biological variables. Models trained on these features with small volumes of data samples predict the type of papillae with an accuracy of 85%. The papillae type classification models can map the spatial arrangement of filiform and fungiform papillae on a surface. Remarkably, the papillae are found to be distinctive across individuals and an individual can be identified with an accuracy of 48% among the 15 participants from a single papillae. Collectively, this is the first unprecedented evidence demonstrating that tongue papillae can serve as a unique identifier inspiring new research direction for food preferences and oral diagnostics.

Via

Access Paper or Ask Questions

Metric Space Magnitude and Generalisation in Neural Networks

May 09, 2023

Rayna Andreeva, Katharina Limbeck, Bastian Rieck, Rik Sarkar

Figure 1 for Metric Space Magnitude and Generalisation in Neural Networks

Figure 2 for Metric Space Magnitude and Generalisation in Neural Networks

Figure 3 for Metric Space Magnitude and Generalisation in Neural Networks

Figure 4 for Metric Space Magnitude and Generalisation in Neural Networks

Abstract:Deep learning models have seen significant successes in numerous applications, but their inner workings remain elusive. The purpose of this work is to quantify the learning process of deep neural networks through the lens of a novel topological invariant called magnitude. Magnitude is an isometry invariant; its properties are an active area of research as it encodes many known invariants of a metric space. We use magnitude to study the internal representations of neural networks and propose a new method for determining their generalisation capabilities. Moreover, we theoretically connect magnitude dimension and the generalisation error, and demonstrate experimentally that the proposed framework can be a good indicator of the latter.

Via

Access Paper or Ask Questions

Differentially Private Shapley Values for Data Evaluation

Jun 01, 2022

Lauren Watson, Rayna Andreeva, Hao-Tsung Yang, Rik Sarkar

Figure 1 for Differentially Private Shapley Values for Data Evaluation

Abstract:The Shapley value has been proposed as a solution to many applications in machine learning, including for equitable valuation of data. Shapley values are computationally expensive and involve the entire dataset. The query for a point's Shapley value can also compromise the statistical privacy of other data points. We observe that in machine learning problems such as empirical risk minimization, and in many learning algorithms (such as those with uniform stability), a diminishing returns property holds, where marginal benefit per data point decreases rapidly with data sample size. Based on this property, we propose a new stratified approximation method called the Layered Shapley Algorithm. We prove that this method operates on small (O(\polylog(n))) random samples of data and small sized ($O(\log n)$) coalitions to achieve the results with guaranteed probabilistic accuracy, and can be modified to incorporate differential privacy. Experimental results show that the algorithm correctly identifies high-value data points that improve validation accuracy, and that the differentially private evaluations preserve approximate ranking of data.

Via

Access Paper or Ask Questions