Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gonzalo G. de Polavieja

Algebraic Machine Learning for Small-to-Medium Datasets Is Competitive against Strong Standard Baselines

May 21, 2026

David Mendez, Fernando Martin-Maroto, Gonzalo G. de Polavieja

Abstract:Symbolic methods are generally not considered competitive with strong modern learners on realistic supervised tasks. We evaluate Algebraic Machine Learning (AML), a framework that learns through subdirect decomposition of algebraic structure rather than numerical optimization, against standard baselines on image and tabular classification across varying training-set sizes. We find that AML trained only on training data without using validation or cross-validation outperforms a family of cross-validated baseline methods including CNNs on small to medium image datasets (50--2000 training examples). On tabular datasets in the same size range, XGBoost is overall the best performing method, but AML is nonetheless comparable to methods incorporating task-specific biases such as LightGBM and random forests. AML achieves this competitive performance across two very different types of datasets using a generic algebraic inductive bias, rather than the modality-specific biases built into standard baselines like CNNs for images or XGBoost for tabular data, and requires no cross validation because it has no task-dependent hyperparameters to tune.

* 9 pages, 4 figures

Via

Access Paper or Ask Questions

Beyond ECE: Calibrated Size Ratio, Risk Assessment, and Confidence-Weighted Metrics

May 05, 2026

Fernando Martin-Maroto, Nabil Abderrahaman, Gonzalo G. de Polavieja

Abstract:Confidence calibration has been dominated by the Expected Calibration Error (ECE), a linear metric that counts calibration offset equally regardless of the confidence level at which it occurs. We show that ECE can remain small even under arbitrarily large overconfidence risk, so we propose Calibrated Size Ratio (CSR) instead, an interpretable metric that equals 1 under perfect calibration, from which we derive the risk probability $P_{\mathrm{risk}}$ that quantifies the statistical evidence for overconfidence. We further argue that overconfidence risk assessment must be complemented by a measure of discriminative value: whether the assigned confidences actively distinguish correct from incorrect predictions. We show that confidence-weighted accuracy $\mathrm{cwA}$ is the natural such complement, and that confidence-weighting extends to all standard classification metrics. In particular, we prove that the confidence-weighted AUC (cwAUC) captures the information about calibration while the classical AUC cannot. We validate the proposed indicators on several synthetic confidence distributions under multiple controlled calibration profiles and find that CSR separates risky from non-risky assignments. We also test the metrics on fifteen real datasets, with and without post-hoc calibration, and find that standard methods can yield risky confidence profiles.

Via

Access Paper or Ask Questions

Integrative neurocybernetic modeling in the era of large-scale neuroscience

Apr 26, 2026

Il Memming Park, Ayesha Vermani, Gonzalo G. de Polavieja, Juan Álvaro Gallego, Kathleen Esfahany, Shreya Saxena, Michael Orger, Auke Ijspeert, Matthew Dowling, Daniel McNamee(+4 more)

Abstract:Large-scale neuroscience is generating rich datasets across animals, brain areas and behavioral contexts, yet our modeling efforts remains fragmented across isolated experiments. We argue that understanding behavior requires integrative neurocybernetic models: understandable dynamical models that capture the closed-loop coupling of brain, body and environment, treat the brain as a controller pursuing latent objectives, represent structured variation across scales, and scale to heterogeneous datasets. Such models shift the goal from predicting neural recordings in isolation to inferring the organizing principles that govern neural and behavioral dynamics. We outline a practical route toward this goal by combining nonlinear state-space models and meta-dynamical extensions with scalable inference, knowledge distillation, mixed open- and closed-loop training, and connectomics-informed architectures. By pooling complementary constraints from recordings, behavior, perturbations and anatomy, integrative neurocybernetic models can provide statistical amplification, few-shot generalization, and mechanistic insight into shared dynamical structure, individual variation, and the control objectives that govern behavior. This agenda offers a model-centric path from fragmented data to a mechanistic science of how brains produce behavior.

* Perspective

Via

Access Paper or Ask Questions

Algebraic Machine Learning: Learning as computing an algebraic decomposition of a task

Feb 27, 2025

Fernando Martin-Maroto, Nabil Abderrahaman, David Mendez, Gonzalo G. de Polavieja

Abstract:Statistics and Optimization are foundational to modern Machine Learning. Here, we propose an alternative foundation based on Abstract Algebra, with mathematics that facilitates the analysis of learning. In this approach, the goal of the task and the data are encoded as axioms of an algebra, and a model is obtained where only these axioms and their logical consequences hold. Although this is not a generalizing model, we show that selecting specific subsets of its breakdown into algebraic atoms obtained via subdirect decomposition gives a model that generalizes. We validate this new learning principle on standard datasets such as MNIST, FashionMNIST, CIFAR-10, and medical images, achieving performance comparable to optimized multilayer perceptrons. Beyond data-driven tasks, the new learning principle extends to formal problems, such as finding Hamiltonian cycles from their specifications and without relying on search. This algebraic foundation offers a fresh perspective on machine intelligence, featuring direct learning from training data without the need for validation dataset, scaling through model additivity, and asymptotic convergence to the underlying rule in the data.

Via

Access Paper or Ask Questions

Supervised dimensionality reduction by a Linear Discriminant Analysis on pre-trained CNN features

Jun 22, 2020

Francisco J. H. Heras, Gonzalo G. de Polavieja

Figure 1 for Supervised dimensionality reduction by a Linear Discriminant Analysis on pre-trained CNN features

Figure 2 for Supervised dimensionality reduction by a Linear Discriminant Analysis on pre-trained CNN features

Figure 3 for Supervised dimensionality reduction by a Linear Discriminant Analysis on pre-trained CNN features

Abstract:We explore the application of linear discriminant analysis (LDA) to the features obtained in different layers of pretrained deep convolutional neural networks (CNNs). The advantage of LDA compared to other techniques in dimensionality reduction is that it reduces dimensions while preserving the global structure of data, so distances in the low-dimensional structure found are meaningful. The LDA applied to the CNN features finds that the centroids of classes corresponding to the similar data lay closer than classes corresponding to different data. We applied the method to a modification of the MNIST dataset with ten additional classes, each new class with half of the images from one of the standard ten classes. The method finds the new classes close to the corresponding standard classes we took the data form. We also applied the method to a dataset of images of butterflies to find that related subspecies are found to be close. For both datasets, we find a performance similar to state-of-the-art methods.

Via

Access Paper or Ask Questions

Algebraic Machine Learning

Mar 15, 2018

Fernando Martin-Maroto, Gonzalo G. de Polavieja

Abstract:Machine learning algorithms use error function minimization to fit a large set of parameters in a preexisting model. However, error minimization eventually leads to a memorization of the training dataset, losing the ability to generalize to other datasets. To achieve generalization something else is needed, for example a regularization method or stopping the training when error in a validation dataset is minimal. Here we propose a different approach to learning and generalization that is parameter-free, fully discrete and that does not use function minimization. We use the training data to find an algebraic representation with minimal size and maximal freedom, explicitly expressed as a product of irreducible components. This algebraic representation is shown to directly generalize, giving high accuracy in test data, more so the smaller the representation. We prove that the number of generalizing representations can be very large and the algebra only needs to find one. We also derive and test a relationship between compression and error rate. We give results for a simple problem solved step by step, hand-written character recognition, and the Queens Completion problem as an example of unsupervised learning. As an alternative to statistical learning, algebraic learning may offer advantages in combining bottom-up and top-down information, formal concept derivation from data and large-scale parallelization.

* In v2 Figures 10 and 12 are images (v1 used latex commands), so all queens on board are now visible

Via

Access Paper or Ask Questions

idtracker.ai: Tracking all individuals in large collectives of unmarked animals

Mar 12, 2018

Francisco Romero-Ferrero, Mattia G. Bergomi, Robert Hinz, Francisco J. H. Heras, Gonzalo G. de Polavieja

Figure 1 for idtracker.ai: Tracking all individuals in large collectives of unmarked animals

Figure 2 for idtracker.ai: Tracking all individuals in large collectives of unmarked animals

Figure 3 for idtracker.ai: Tracking all individuals in large collectives of unmarked animals

Figure 4 for idtracker.ai: Tracking all individuals in large collectives of unmarked animals

Abstract:Our understanding of collective animal behavior is limited by our ability to track each of the individuals. We describe an algorithm and software, idtracker.ai, that extracts from video all trajectories with correct identities at a high accuracy for collectives of up to 100 individuals. It uses two deep networks, one detecting when animals touch or cross and another one for animal identification, trained adaptively to conditions and difficulty of the video.

* 44 pages, 1 main figure, 13 supplementary figures, 6 tables

Via

Access Paper or Ask Questions