Current unsupervised learning methods depend on end-to-end training via deep learning techniques such as self-supervised learning, with high computational requirements, or employ layer-by-layer training using bio-inspired approaches like Hebbian learning, using local learning rules incompatible with supervised learning. Both approaches are problematic for edge AI hardware that relies on sparse computational resources and would strongly benefit from alternating between unsupervised and supervised learning phases - thus leveraging widely available unlabeled data from the environment as well as labeled training datasets. To solve this challenge, in this work, we introduce a 'self-defined target' that uses Winner-Take-All (WTA) selectivity at the network's final layer, complemented by regularization through biologically inspired homeostasis mechanism. This approach, framework-agnostic and compatible with both global (Backpropagation) and local (Equilibrium propagation) learning rules, achieves a 97.6% test accuracy on the MNIST dataset. Furthermore, we demonstrate that incorporating a hidden layer enhances classification accuracy and the quality of learned features across all training methods, showcasing the advantages of end-to-end unsupervised training. Extending to semi-supervised learning, our method dynamically adjusts the target according to data availability, reaching a 96.6% accuracy with just 600 labeled MNIST samples. This result highlights our 'unsupervised target' strategy's efficacy and flexibility in scenarios ranging from abundant to no labeled data availability.
Extracting information from radiofrequency (RF) signals using artificial neural networks at low energy cost is a critical need for a wide range of applications. Here we show how to leverage the intrinsic dynamics of spintronic nanodevices called magnetic tunnel junctions to process multiple analogue RF inputs in parallel and perform synaptic operations. Furthermore, we achieve classification of RF signals with experimental data from magnetic tunnel junctions as neurons and synapses, with the same accuracy as an equivalent software neural network. These results are a key step for embedded radiofrequency artificial intelligence.
Neuromorphic computing approaches become increasingly important as we address future needs for efficiently processing massive amounts of data. The unique attributes of quantum materials can help address these needs by enabling new energy-efficient device concepts that implement neuromorphic ideas at the hardware level. In particular, strong correlations give rise to highly non-linear responses, such as conductive phase transitions that can be harnessed for short and long-term plasticity. Similarly, magnetization dynamics are strongly non-linear and can be utilized for data classification. This paper discusses select examples of these approaches, and provides a perspective for the current opportunities and challenges for assembling quantum-material-based devices for neuromorphic functionalities into larger emergent complex network systems.
Deep learning has an increasing impact to assist research, allowing, for example, the discovery of novel materials. Until now, however, these artificial intelligence techniques have fallen short of discovering the full differential equation of an experimental physical system. Here we show that a dynamical neural network, trained on a minimal amount of data, can predict the behavior of spintronic devices with high accuracy and an extremely efficient simulation time, compared to the micromagnetic simulations that are usually employed to model them. For this purpose, we re-frame the formalism of Neural Ordinary Differential Equations (ODEs) to the constraints of spintronics: few measured outputs, multiple inputs and internal parameters. We demonstrate with Spin-Neural ODEs an acceleration factor over 200 compared to micromagnetic simulations for a complex problem -- the simulation of a reservoir computer made of magnetic skyrmions (20 minutes compared to three days). In a second realization, we show that we can predict the noisy response of experimental spintronic nano-oscillators to varying inputs after training Spin-Neural ODEs on five milliseconds of their measured response to different excitations. Spin-Neural ODE is a disruptive tool for developing spintronic applications in complement to micromagnetic simulations, which are time-consuming and cannot fit experiments when noise or imperfections are present. Spin-Neural ODE can also be generalized to other electronic devices involving dynamics.
Equilibrium Propagation (EP) is an algorithm intrinsically adapted to the training of physical networks, in particular thanks to the local updates of weights given by the internal dynamics of the system. However, the construction of such a hardware requires to make the algorithm compatible with the existing neuromorphic CMOS technology, which generally exploits digital communication between neurons and offers a limited amount of local memory. In this work, we demonstrate that EP can train dynamical networks with binary activations and weights. We first train systems with binary weights and full-precision activations, achieving an accuracy equivalent to that of full-precision models trained by standard EP on MNIST, and losing only 1.9% accuracy on CIFAR-10 with equal architecture. We then extend our method to the training of models with binary activations and weights on MNIST, achieving an accuracy within 1% of the full-precision reference for fully connected architectures and reaching the full-precision reference accuracy for the convolutional architecture. Our extension of EP training to binary networks is consistent with the requirements of today's dynamic, brain-inspired hardware platforms and paves the way for very low-power end-to-end learning.
Equilibrium Propagation (EP) is a biologically-inspired counterpart of Backpropagation Through Time (BPTT) which, owing to its strong theoretical guarantees and the locality in space of its learning rule, fosters the design of energy-efficient hardware dedicated to learning. In practice, however, EP does not scale to visual tasks harder than MNIST. In this work, we show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon and that cancelling it allows training deep ConvNets by EP, including architectures with distinct forward and backward connections. These results highlight EP as a scalable approach to compute error gradients in deep neural networks, thereby motivating its hardware implementation.
Neuromorphic systems achieve high energy efficiency by computing with spikes, in a brain-inspired way. However, finding spike-based learning algorithms that can be implemented within the local constraints of neuromorphic systems, while achieving high accuracy, remains a formidable challenge. Equilibrium Propagation is a hardware-friendly counterpart of backpropagation which only involves spatially local computations and applies to recurrent neural networks with static inputs. So far, hardware-oriented studies of Equilibrium Propagation focused on rate-based networks. In this work, we develop a spiking neural network algorithm called EqSpike, compatible with neuromorphic systems, which learns by Equilibrium Propagation. Through simulations, we obtain a test recognition accuracy of 96.9% on MNIST, similar to rate-based Equilibrium Propagation, and comparing favourably to alternative learning techniques for spiking neural networks. We show that EqSpike implemented in silicon neuromorphic technology could reduce the energy consumption of inference and training by up to three orders of magnitude compared to GPUs. Finally, we also show that during learning, EqSpike weight updates exhibit a form of Spike Timing Dependent Plasticity, highlighting a possible connection with biology.
Equilibrium Propagation (EP) is a learning algorithm that bridges Machine Learning and Neuroscience, by computing gradients closely matching those of Backpropagation Through Time (BPTT), but with a learning rule local in space. Given an input $x$ and associated target $y$, EP proceeds in two phases: in the first phase neurons evolve freely towards a first steady state; in the second phase output neurons are nudged towards $y$ until they reach a second steady state. However, in existing implementations of EP, the learning rule is not local in time: the weight update is performed after the dynamics of the second phase have converged and requires information of the first phase that is no longer available physically. In this work, we propose a version of EP named Continual Equilibrium Propagation (C-EP) where neuron and synapse dynamics occur simultaneously throughout the second phase, so that the weight update becomes local in time. Such a learning rule local both in space and time opens the possibility of an extremely energy efficient hardware implementation of EP. We prove theoretically that, provided the learning rates are sufficiently small, at each time step of the second phase the dynamics of neurons and synapses follow the gradients of the loss given by BPTT (Theorem 1). We demonstrate training with C-EP on MNIST and generalize C-EP to neural networks where neurons are connected by asymmetric connections. We show through experiments that the more the network updates follows the gradients of BPTT, the best it performs in terms of training. These results bring EP a step closer to biology by better complying with hardware constraints while maintaining its intimate link with backpropagation.
Equilibrium Propagation (EP) is a biologically inspired alternative algorithm to backpropagation (BP) for training neural networks. It applies to RNNs fed by a static input x that settle to a steady state, such as Hopfield networks. EP is similar to BP in that in the second phase of training, an error signal propagates backwards in the layers of the network, but contrary to BP, the learning rule of EP is spatially local. Nonetheless, EP suffers from two major limitations. On the one hand, due to its formulation in terms of real-time dynamics, EP entails long simulation times, which limits its applicability to practical tasks. On the other hand, the biological plausibility of EP is limited by the fact that its learning rule is not local in time: the synapse update is performed after the dynamics of the second phase have converged and requires information of the first phase that is no longer available physically. Our work addresses these two issues and aims at widening the spectrum of EP from standard machine learning models to more bio-realistic neural networks. First, we propose a discrete-time formulation of EP which enables to simplify equations, speed up training and extend EP to CNNs. Our CNN model achieves the best performance ever reported on MNIST with EP. Using the same discrete-time formulation, we introduce Continual Equilibrium Propagation (C-EP): the weights of the network are adjusted continually in the second phase of training using local information in space and time. We show that in the limit of slow changes of synaptic strengths and small nudging, C-EP is equivalent to BPTT (Theorem 1). We numerically demonstrate Theorem 1 and C-EP training on MNIST and generalize it to the bio-realistic situation of a neural network with asymmetric connections between neurons.