Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Edward Hirst

Minimising Willmore Energy via Neural Flow

Apr 06, 2026

Edward Hirst, Henrique N. Sá Earp, Tomás S. R. Silva

Abstract:The neural Willmore flow of a closed oriented $2$-surface in $\mathbb{R}^3$ is introduced as a natural evolution process to minimise the Willmore energy, which is the squared $L^2$-norm of mean curvature. Neural architectures are used to model maps from topological $2d$ domains to $3d$ Euclidean space, where the learning process minimises a PINN-style loss for the Willmore energy as a functional on the embedding. Training reproduces the expected round sphere for genus $0$ surfaces, and the Clifford torus for genus $1$ surfaces, respectively. Furthermore, the experiment in the genus $2$ case provides a novel approach to search for minimal Willmore surfaces in this open problem.

* 16+5 pages, 9 figures

Via

Access Paper or Ask Questions

A Machine Learning Approach to the Nirenberg Problem

Feb 12, 2026

Gianfranco Cortés, Maria Esteban-Casadevall, Yueqing Feng, Jonas Henkel, Edward Hirst, Tancredi Schettini Gherardini, Alexander G. Stapleton

Abstract:This work introduces the Nirenberg Neural Network: a numerical approach to the Nirenberg problem of prescribing Gaussian curvature on $S^2$ for metrics that are pointwise conformal to the round metric. Our mesh-free physics-informed neural network (PINN) approach directly parametrises the conformal factor globally and is trained with a geometry-aware loss enforcing the curvature equation. Additional consistency checks were performed via the Gauss-Bonnet theorem, and spherical-harmonic expansions were fit to the learnt models to provide interpretability. For prescribed curvatures with known realisability, the neural network achieves very low losses ($10^{-7} - 10^{-10}$), while unrealisable curvatures yield significantly higher losses. This distinction enables the assessment of unknown cases, separating likely realisable functions from non-realisable ones. The current capabilities of the Nirenberg Neural Network demonstrate that neural solvers can serve as exploratory tools in geometric analysis, offering a quantitative computational perspective on longstanding existence questions.

* 38 pages, 14 pages, 7 tables

Via

Access Paper or Ask Questions

Neural and numerical methods for $\mathrm{G}_2$-structures on contact Calabi-Yau 7-manifolds

Feb 12, 2026

Elli Heyes, Edward Hirst, Henrique N. Sá Earp, Tomás S. R. Silva

Abstract:A numerical framework for approximating $\mathrm{G}_2$-structure 3-forms on contact Calabi-Yau manifolds is presented. The approach proceeds in three stages: first, existing neural network models are employed to compute an approximate Ricci-flat metric on a Calabi-Yau threefold. Second, using this metric and the explicit construction of a $\mathrm{G}_2$-structure on the associated 7-dimensional Calabi-Yau link in the 9-sphere, numerical approximations of the 3-form are generated on a large set of sampled points. Finally, a dedicated neural architecture is trained to learn the 3-form and its induced Riemannian metric directly from data, validating the learned structure and its torsion via a numerical implementation of the exterior derivative, which may be of independent interest.

* 8+5 pages, 9 figures

Via

Access Paper or Ask Questions

Versor: A Geometric Sequence Architecture

Feb 10, 2026

Truong Minh Huy, Edward Hirst

Abstract:A novel sequence architecture design is introduced, Versor, which uses Conformal Geometric Algebra (CGA) in place of the traditional fundamental non-linear operations to achieve structural generalization and significant performance improvements on a variety of tasks, while offering improved interpretability and efficiency. By embedding states in the $Cl_{4,1}$ manifold and evolving them via geometric transformations (rotors), Versor natively represents $SE(3)$-equivariant relationships without requiring explicit structural encoding. Versor is validated on chaotic N-body dynamics, topological reasoning, and standard multimodal benchmarks (CIFAR-10, WikiText-103), consistently outperforming Transformers, Graph Networks, and geometric baselines (GATr, EGNN). Key results include: orders of magnitude fewer parameters ($200\times$ vs. Transformers); interpretable attention decomposing into proximity and orientational components; zero-shot scale generalization (99.3% MCC on topology vs. 50.4% for ViT); and $O(L)$ linear complexity via the novel Recursive Rotor Accumulator. In out-of-distribution tests, Versor maintains stable predictions while Transformers fail catastrophically. Custom Clifford kernels achieve up to $78\times$ speedup, providing a scalable foundation for geometrically-aware scientific modeling.

* 16+23 pages, 3 figures

Via

Access Paper or Ask Questions

Grokking vs. Learning: Same Features, Different Encodings

Feb 03, 2025

Dmitry Manning-Coe, Jacopo Gliozzi, Alexander G. Stapleton, Edward Hirst, Giuseppe De Tomasi, Barry Bradlyn, David S. Berman

Figure 1 for Grokking vs. Learning: Same Features, Different Encodings

Figure 2 for Grokking vs. Learning: Same Features, Different Encodings

Figure 3 for Grokking vs. Learning: Same Features, Different Encodings

Figure 4 for Grokking vs. Learning: Same Features, Different Encodings

Abstract:Grokking typically achieves similar loss to ordinary, "steady", learning. We ask whether these different learning paths - grokking versus ordinary training - lead to fundamental differences in the learned models. To do so we compare the features, compressibility, and learning dynamics of models trained via each path in two tasks. We find that grokked and steadily trained models learn the same features, but there can be large differences in the efficiency with which these features are encoded. In particular, we find a novel "compressive regime" of steady training in which there emerges a linear trade-off between model loss and compressibility, and which is absent in grokking. In this regime, we can achieve compression factors 25x times the base model, and 5x times the compression achieved in grokking. We then track how model features and compressibility develop through training. We show that model development in grokking is task-dependent, and that peak compressibility is achieved immediately after the grokking plateau. Finally, novel information-geometric measures are introduced which demonstrate that models undergoing grokking follow a straight path in information space.

* Code available at: https://github.com/xand-stapleton/grokking_vs_learning

Via

Access Paper or Ask Questions

Machine Learning Mutation-Acyclicity of Quivers

Nov 06, 2024

Kymani T. K. Armstrong-Williams, Edward Hirst, Blake Jackson, Kyu-Hwan Lee

Figure 1 for Machine Learning Mutation-Acyclicity of Quivers

Figure 2 for Machine Learning Mutation-Acyclicity of Quivers

Figure 3 for Machine Learning Mutation-Acyclicity of Quivers

Figure 4 for Machine Learning Mutation-Acyclicity of Quivers

Abstract:Machine learning (ML) has emerged as a powerful tool in mathematical research in recent years. This paper applies ML techniques to the study of quivers--a type of directed multigraph with significant relevance in algebra, combinatorics, computer science, and mathematical physics. Specifically, we focus on the challenging problem of determining the mutation-acyclicity of a quiver on 4 vertices, a property that is pivotal since mutation-acyclicity is often a necessary condition for theorems involving path algebras and cluster algebras. Although this classification is known for quivers with at most 3 vertices, little is known about quivers on more than 3 vertices. We give a computer-assisted proof of a theorem to prove that mutation-acyclicity is decidable for quivers on 4 vertices with edge weight at most 2. By leveraging neural networks (NNs) and support vector machines (SVMs), we then accurately classify more general 4-vertex quivers as mutation-acyclic or non-mutation-acyclic. Our results demonstrate that ML models can efficiently detect mutation-acyclicity, providing a promising computational approach to this combinatorial problem, from which the trained SVM equation provides a starting point to guide future theoretical development.

* 30 pages, 14 figures, 8 tables

Via

Access Paper or Ask Questions

Learning 3-Manifold Triangulations

May 15, 2024

Francesco Costantino, Yang-Hui He, Elli Heyes, Edward Hirst

Abstract:Real 3-manifold triangulations can be uniquely represented by isomorphism signatures. Databases of these isomorphism signatures are generated for a variety of 3-manifolds and knot complements, using SnapPy and Regina, then these language-like inputs are used to train various machine learning architectures to differentiate the manifolds, as well as their Dehn surgeries, via their triangulations. Gradient saliency analysis then extracts key parts of this language-like encoding scheme from the trained models. The isomorphism signature databases are taken from the 3-manifolds' Pachner graphs, which are also generated in bulk for some selected manifolds of focus and for the subset of the SnapPy orientable cusped census with $<8$ initial tetrahedra. These Pachner graphs are further analysed through the lens of network science to identify new structure in the triangulation representation; in particular for the hyperbolic case, a relation between the length of the shortest geodesic (systole) and the size of the Pachner graph's ball is observed.

* 35 pages; 23 figures; 7 tables

Via

Access Paper or Ask Questions

Calabi-Yau Four/Five/Six-folds as $\mathbb{P}^n_\textbf{w}$ Hypersurfaces: Machine Learning, Approximation, and Generation

Nov 28, 2023

Edward Hirst, Tancredi Schettini Gherardini

$Figure 1 for Calabi-Yau Four/Five/Six-folds as $\mathbb{P}^n_\textbf{w}$ Hypersurfaces: Machine Learning, Approximation, and Generation$

$Figure 2 for Calabi-Yau Four/Five/Six-folds as $\mathbb{P}^n_\textbf{w}$ Hypersurfaces: Machine Learning, Approximation, and Generation$

$Figure 3 for Calabi-Yau Four/Five/Six-folds as $\mathbb{P}^n_\textbf{w}$ Hypersurfaces: Machine Learning, Approximation, and Generation$

$Figure 4 for Calabi-Yau Four/Five/Six-folds as $\mathbb{P}^n_\textbf{w}$ Hypersurfaces: Machine Learning, Approximation, and Generation$

Abstract:Calabi-Yau four-folds may be constructed as hypersurfaces in weighted projective spaces of complex dimension 5 defined via weight systems of 6 weights. In this work, neural networks were implemented to learn the Calabi-Yau Hodge numbers from the weight systems, where gradient saliency and symbolic regression then inspired a truncation of the Landau-Ginzburg model formula for the Hodge numbers of any dimensional Calabi-Yau constructed in this way. The approximation always provides a tight lower bound, is shown to be dramatically quicker to compute (with compute times reduced by up to four orders of magnitude), and gives remarkably accurate results for systems with large weights. Additionally, complementary datasets of weight systems satisfying the necessary but insufficient conditions for transversality were constructed, including considerations of the IP, reflexivity, and intradivisibility properties. Overall producing a classification of this weight system landscape, further confirmed with machine learning methods. Using the knowledge of this classification, and the properties of the presented approximation, a novel dataset of transverse weight systems consisting of 7 weights was generated for a sum of weights $\leq 200$; producing a new database of Calabi-Yau five-folds, with their respective topological properties computed. Further to this an equivalent database of candidate Calabi-Yau six-folds was generated with approximated Hodge numbers.

* 37 pages; 13 figures; 12 tables

Via

Access Paper or Ask Questions

Machine Learning Clifford invariants of ADE Coxeter elements

Sep 29, 2023

Siqi Chen, Pierre-Philippe Dechant, Yang-Hui He, Elli Heyes, Edward Hirst, Dmitrii Riabchenko

Abstract:There has been recent interest in novel Clifford geometric invariants of linear transformations. This motivates the investigation of such invariants for a certain type of geometric transformation of interest in the context of root systems, reflection groups, Lie groups and Lie algebras: the Coxeter transformations. We perform exhaustive calculations of all Coxeter transformations for $A_8$, $D_8$ and $E_8$ for a choice of basis of simple roots and compute their invariants, using high-performance computing. This computational algebra paradigm generates a dataset that can then be mined using techniques from data science such as supervised and unsupervised machine learning. In this paper we focus on neural network classification and principal component analysis. Since the output -- the invariants -- is fully determined by the choice of simple roots and the permutation order of the corresponding reflections in the Coxeter element, we expect huge degeneracy in the mapping. This provides the perfect setup for machine learning, and indeed we see that the datasets can be machine learned to very high accuracy. This paper is a pump-priming study in experimental mathematics using Clifford algebras, showing that such Clifford algebraic datasets are amenable to machine learning, and shedding light on relationships between these novel and other well-known geometric invariants and also giving rise to analytic results.

* 34 pages, 16 Figures, 12 Tables

Via

Access Paper or Ask Questions

Machine Learning Algebraic Geometry for Physics

Apr 21, 2022

Jiakang Bao, Yang-Hui He, Elli Heyes, Edward Hirst

Figure 1 for Machine Learning Algebraic Geometry for Physics

Figure 2 for Machine Learning Algebraic Geometry for Physics

Figure 3 for Machine Learning Algebraic Geometry for Physics

Figure 4 for Machine Learning Algebraic Geometry for Physics

Abstract:We review some recent applications of machine learning to algebraic geometry and physics. Since problems in algebraic geometry can typically be reformulated as mappings between tensors, this makes them particularly amenable to supervised learning. Additionally, unsupervised methods can provide insight into the structure of such geometrical data. At the heart of this programme is the question of how geometry can be machine learned, and indeed how AI helps one to do mathematics. This is a chapter contribution to the book Machine learning and Algebraic Geometry, edited by A. Kasprzyk et al.

* 32 pages, 25 figures. Contribution to Machine learning and Algebraic Geometry, edited by A. Kasprzyk et al

Via

Access Paper or Ask Questions