Aurélien Decelle

The Symmetric Perceptron: a Teacher-Student Scenario

Mar 26, 2026

Giovanni Catania, Aurélien Decelle, Suhanee Korpe

Abstract: We introduce and solve a teacher-student formulation of the symmetric binary Perceptron, turning a traditionally storage-oriented model into a planted inference problem with a guaranteed solution at any sample density. We adapt the formulation of the symmetric Perceptron, which traditionally considers either the u-shaped potential or the rectangular one, by including labels in both regions. With this formulation, we analyze both the Bayes-optimal regime for noise-less examples and the effect of thermal noise under two different potential/classification rules. Using annealed and quenched free-entropy calculations in the high-dimensional limit, we map the phase diagram in the three control parameters, namely the sample density $\alpha$, the distance $\kappa$ between the origin and one of the symmetric hyperplanes, and the temperature $T$, and identify a robust scenario where learning is organized by a second-order instability that creates teacher-correlated suboptimal states, followed by a first-order transition to full alignment. We show how this structure depends on the choice of potential and on the interplay between metastability of the suboptimal solution and its melting towards the planted configuration, which is relevant for Monte Carlo-based optimization algorithms.

* 19 pages, 6 figures

On the role of non-linear latent features in bipartite generative neural networks

Jun 12, 2025

Tony Bonnaire, Giovanni Catania, Aurélien Decelle, Beatriz Seoane

Abstract: We investigate the phase diagram and memory retrieval capabilities of bipartite energy-based neural networks, namely Restricted Boltzmann Machines (RBMs), as a function of the prior distribution imposed on their hidden units, including binary, multi-state, and ReLU-like activations. Drawing connections to the Hopfield model and employing analytical tools from statistical physics of disordered systems, we explore how the architectural choices and activation functions shape the thermodynamic properties of these models. Our analysis reveals that standard RBMs with binary hidden nodes and extensive connectivity suffer from reduced critical capacity, limiting their effectiveness as associative memories. To address this, we examine several modifications, such as introducing local biases and adopting richer hidden unit priors. These adjustments restore ordered retrieval phases and markedly improve recall performance, even at finite temperatures. Our theoretical findings, supported by finite-size Monte Carlo simulations, highlight the importance of hidden unit design in enhancing the expressive power of RBMs.

* 23 pages, 5 figures

A theoretical framework for overfitting in energy-based modeling

Jan 31, 2025

Giovanni Catania, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane

Abstract: We investigate the impact of limited data on training pairwise energy-based models for inverse problems aimed at identifying interaction networks. Using the Gaussian model as a testbed, we dissect training trajectories across the eigenbasis of the coupling matrix, exploiting the independent evolution of eigenmodes and revealing that the learning timescales are tied to the spectral decomposition of the empirical covariance matrix. We see that optimal points for early stopping arise from the interplay between these timescales and the initial conditions of training. Moreover, we show that finite data corrections can be accurately modeled through asymptotic random matrix theory calculations and provide the counterpart of generalized cross-validation in the energy-based model context. Our analytical framework extends to binary-variable maximum-entropy pairwise models with minimal variations. These findings offer strategies to control overfitting in discrete-variable models through empirical shrinkage corrections, improving the management of overfitting in energy-based generative models.

* 23 pages, 13 figures (including appendix)

Inferring High-Order Couplings with Neural Networks

Jan 10, 2025

Aurélien Decelle, Alfonso de Jesús Navas Gómez, Beatriz Seoane

Abstract: Maximum-entropy methods, rooted in the inverse Ising/Potts problem from statistical mechanics, have become indispensable tools for modeling pairwise interactions in disciplines such as bioinformatics, ecology, and neuroscience. Despite their remarkable success, these methods often overlook high-order interactions that may be crucial in complex systems. Conversely, while modern machine learning approaches can capture such interactions, existing interpretable frameworks are computationally expensive, making it impractical to assess the relevance of high-order interactions in real-world scenarios. Restricted Boltzmann Machines (RBMs) offer a computationally efficient alternative by encoding statistical correlations via hidden nodes in a bipartite neural network. Here, we present a method that maps RBMs exactly onto generalized Potts models with interactions of arbitrarily high order. This approach leverages large-$N$ approximations, facilitated by the simple architecture of the RBM, to enable the efficient extraction of effective many-body couplings with minimal computational cost. This mapping also enables the development of a general formal framework for the extraction of effective higher-order interactions in arbitrarily complex probabilistic models. Additionally, we introduce a robust formalism for gauge fixing within the generalized Potts model. We validate our method by accurately recovering two- and three-body interactions from synthetic datasets. Applying our framework to protein sequence data further demonstrates its effectiveness in reconstructing protein contact maps, achieving performance comparable to state-of-the-art inverse Potts models. These results position RBMs as a powerful and efficient tool for investigating high-order interactions in complex systems.

* 13 Pages and 3 Figures

Fast, accurate training and sampling of Restricted Boltzmann Machines

May 24, 2024

Nicolas Béreux, Aurélien Decelle, Cyril Furtlehner, Lorenzo Rosset, Beatriz Seoane

Abstract: Thanks to their simple architecture, Restricted Boltzmann Machines (RBMs) are powerful tools for modeling complex systems and extracting interpretable insights from data. However, training RBMs, like other energy-based models, on highly structured data poses a major challenge, as effective training relies on mixing the Markov chain Monte Carlo simulations used to estimate the gradient. This process is often hindered by multiple second-order phase transitions and the associated critical slowing down. In this paper, we present an innovative method in which the principal directions of the dataset are integrated into a low-rank RBM through a convex optimization procedure. This approach enables efficient sampling of the equilibrium measure via a static Monte Carlo process. By starting the standard training process with a model that already accurately represents the main modes of the data, we bypass the initial phase transitions. Our results show that this strategy successfully trains RBMs to capture the full diversity of data in datasets where previous methods fail. Furthermore, we use the training trajectories to propose a new sampling method, parallel trajectory tempering, which allows us to sample the equilibrium measure of the trained model much faster than previous optimized MCMC approaches and yields a better estimation of the log-likelihood. We illustrate the success of the training method on several highly structured datasets.

* 18 pages, 8 figures

Cascade of phase transitions in the training of Energy-based models

May 23, 2024

Dimitrios Bachtis, Giulio Biroli, Aurélien Decelle, Beatriz Seoane

Abstract: In this paper, we investigate the feature encoding process in a prototypical energy-based generative model, the Restricted Boltzmann Machine (RBM). We start with an analytical investigation using simplified architectures and data structures, and end with a numerical analysis of real trainings on real datasets. Our study tracks the evolution of the model's weight matrix through its singular value decomposition, revealing a series of phase transitions associated with a progressive learning of the principal modes of the empirical probability distribution. The model first learns the center of mass of the modes and then progressively resolves all modes through a cascade of phase transitions. We first describe this process in a controlled setup that allows us to study the training dynamics analytically. We then validate our theoretical results by training the Bernoulli-Bernoulli RBM on real data sets. By using data sets of increasing dimension, we show that learning indeed leads to sharp phase transitions in the high-dimensional limit. Moreover, we propose and test a mean-field finite-size scaling hypothesis. This shows that the first phase transition is in the same universality class as the one we studied analytically, which is reminiscent of the mean-field paramagnetic-to-ferromagnetic phase transition.

* 19 pages, 6 figures

Predicting large scale cosmological structure evolution with GAN-based autoencoders

Mar 04, 2024

Marion Ullmo, Nabila Aghanim, Aurélien Decelle, Miguel Aragon-Calvo

Abstract: Cosmological simulations play a key role in the prediction and understanding of large-scale structure formation from initial conditions. We make use of GAN-based Autoencoders (AEs) in an attempt to predict structure evolution within simulations. The AEs are trained on images and cubes taken from 2D and 3D N-body simulations, respectively, describing the evolution of the dark matter (DM) field. We find that while the AEs can predict structure evolution well for 2D simulations of DM fields using only the density fields as input, they perform significantly worse under similar conditions for 3D simulations. However, additionally providing velocity fields as inputs greatly improves results, with similar predictions regardless of the time difference between input and target.

* 11 pages, 11 figures

Inferring effective couplings with Restricted Boltzmann Machines

Sep 20, 2023

Aurélien Decelle, Cyril Furtlehner, Alfonso De Jesus Navas Gómez, Beatriz Seoane

Abstract: Generative models offer a direct way to model complex data. Among them, energy-based models provide us with a neural network model that aims to accurately reproduce all statistical correlations observed in the data at the level of the Boltzmann weight of the model. However, one challenge is to understand the physical interpretation of such models. In this study, we propose a simple solution by implementing a direct mapping between the energy function of the Restricted Boltzmann Machine and an effective Ising spin Hamiltonian that includes high-order interactions between spins. This mapping includes interactions of all possible orders, going beyond the conventional pairwise interactions typically considered in the inverse Ising approach, and allowing the description of complex datasets. Earlier works attempted to achieve this goal, but the proposed mappings did not properly treat the complexity of the problem or did not contain direct prescriptions for practical application. To validate our method, we performed several controlled numerical experiments where we trained the RBMs using equilibrium samples of predefined models containing local external fields, two-body and three-body interactions in various low-dimensional topologies. The results demonstrate the effectiveness of our proposed approach in learning the correct interaction network and pave the way for its application in modeling interesting datasets. We also evaluate the quality of the inferred model based on different training methods.

* 15 figures, 31 pages

The Copycat Perceptron: Smashing Barriers Through Collective Learning

Aug 07, 2023

Giovanni Catania, Aurélien Decelle, Beatriz Seoane

Abstract: We characterize the equilibrium properties of a model of $y$ coupled binary perceptrons in the teacher-student scenario, subject to a suitable learning rule, with an explicit ferromagnetic coupling proportional to the Hamming distance between the students' weights. In contrast to recent works, we analyze a more general setting in which thermal noise is present and affects the generalization performance of each student. Specifically, in the presence of a nonzero temperature, which assigns nonzero probability to configurations that misclassify samples with respect to the teacher's prescription, we find that the coupling of replicas leads to a shift of the phase diagram to smaller values of $\alpha$. This suggests that the free energy landscape gets smoother around the solution with good generalization (i.e., the teacher) at a fixed fraction of reviewed examples, which allows local update algorithms such as Simulated Annealing to reach the solution before the dynamics gets frozen. Finally, from a learning perspective, these results suggest that more students (in this case, with the same amount of data) are able to learn the same rule when coupled together with a smaller amount of data.

* 4 figures

Fast and Functional Structured Data Generators Rooted in Out-of-Equilibrium Physics

Jul 13, 2023

Alessandra Carbone, Aurélien Decelle, Lorenzo Rosset, Beatriz Seoane

Abstract: In this study, we address the challenge of using energy-based models to produce high-quality, label-specific data in complex structured datasets, such as population genetics, RNA or protein sequence data. Traditional training methods encounter difficulties due to inefficient Markov chain Monte Carlo mixing, which affects the diversity of synthetic data and increases generation times. To address these issues, we use a novel training algorithm that exploits non-equilibrium effects. This approach, applied to the Restricted Boltzmann Machine, improves the model's ability to correctly classify samples and generate high-quality synthetic data in only a few sampling steps. The effectiveness of this method is demonstrated by its successful application to four different types of data: handwritten digits, mutations of human genomes classified by continental origin, functionally characterized sequences of an enzyme protein family, and homologous RNA sequences from specific taxonomies.

* 15 pages
