Abstract: Air hockey demands split-second decisions at high puck velocities, a challenge we address with a compact network of spiking neurons running on a mixed-signal analog/digital neuromorphic processor. By co-designing hardware and learning algorithms, we train the system to achieve successful puck interactions through reinforcement learning in a remarkably small number of trials. The network leverages fixed random connectivity to capture the task's temporal structure and adopts a local e-prop learning rule in the readout layer to exploit event-driven activity for fast and efficient learning. The result is real-time learning with a setup comprising a computer and the neuromorphic chip in the loop, enabling practical training of spiking neural networks for autonomous robotic systems. This work bridges neuroscience-inspired hardware with real-world robotic control, showing that brain-inspired approaches can tackle fast-paced interaction tasks while supporting always-on learning in intelligent machines.
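To make the learning scheme concrete, the sketch below illustrates the kind of local, reward-modulated readout update that an e-prop-style rule implies when only the readout layer is plastic. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name EPropReadout, the parameter names and values (tau_trace, eta, the layer sizes), and the one-hot action gating are all hypothetical choices made for the example.

```python
import numpy as np

class EPropReadout:
    """Reward-modulated, e-prop-style plastic readout on top of a
    fixed random spiking reservoir (the reservoir itself, running on
    the neuromorphic chip, is not modeled here)."""

    def __init__(self, n_res=256, n_out=3, dt=1e-3,
                 tau_trace=20e-3, eta=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0.0, 0.1, (n_out, n_res))  # plastic readout weights
        self.elig = np.zeros((n_out, n_res))           # per-synapse eligibility traces
        self.decay = np.exp(-dt / tau_trace)           # trace decay per time step
        self.eta = eta                                 # learning rate

    def step(self, spikes, action_onehot):
        """One time step: accumulate a local per-synapse trace (filtered
        presynaptic spikes, gated by the selected output) and return the
        readout potentials used for action selection."""
        self.elig = self.decay * self.elig + np.outer(action_onehot, spikes)
        return self.w @ spikes

    def apply_reward(self, reward):
        """Three-factor update: a delayed scalar reward, broadcast to all
        readout synapses, scales their eligibility traces."""
        self.w += self.eta * reward * self.elig
```

Because the update depends only on a per-synapse trace and a broadcast scalar reward, a rule of this form maps naturally onto event-driven activity, with the weight update applied by the computer in the loop as the abstract describes.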

Abstract: Achieving energy efficiency in learning is a key challenge for artificial intelligence (AI) computing platforms. Biological systems demonstrate a remarkable ability to learn complex skills quickly and efficiently. Inspired by this, we present a hardware implementation of model-based reinforcement learning (MBRL) using spiking neural networks (SNNs) on mixed-signal analog/digital neuromorphic hardware. This approach leverages the energy efficiency of mixed-signal neuromorphic chips while achieving high sample efficiency by alternating between online learning, referred to as the "awake" phase, and offline learning, known as the "dreaming" phase. The proposed model includes two symbiotic networks: an agent network that learns by combining real and simulated experiences, and a learned world model network that generates the simulated experiences. We validate the model by training the hardware implementation to play the Atari game Pong. We start from a baseline consisting of an agent network that learns without a world model or dreaming, which successfully learns to play the game. By incorporating dreaming, the number of real game experiences required is reduced significantly compared to the baseline. The networks are implemented on a mixed-signal neuromorphic processor, with the readout layers trained using a computer in the loop while the other layers remain fixed. These results pave the way toward energy-efficient neuromorphic learning systems capable of rapid learning in real-world applications and use cases.
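The awake/dreaming alternation described above can be summarized as a short training loop. The sketch below is a schematic reading of the abstract, not the authors' code: the Agent, WorldModel, and env interfaces, as well as the epoch and step counts, are hypothetical placeholders for illustration.

```python
def train(agent, world_model, env, n_epochs=100,
          awake_steps=50, dream_steps=200):
    """Alternate an 'awake' phase (real environment interaction) with a
    'dreaming' phase (rollouts generated by the learned world model)."""
    for epoch in range(n_epochs):
        # --- awake phase: collect and learn from real experience ---
        obs = env.reset()
        for _ in range(awake_steps):
            action = agent.act(obs)
            next_obs, reward, done = env.step(action)
            agent.learn(obs, action, reward, next_obs)        # real experience
            world_model.learn(obs, action, reward, next_obs)  # fit the dynamics
            obs = env.reset() if done else next_obs

        # --- dreaming phase: learn from simulated experience ---
        obs = world_model.sample_start_state()
        for _ in range(dream_steps):
            action = agent.act(obs)
            next_obs, reward = world_model.predict(obs, action)
            agent.learn(obs, action, reward, next_obs)        # simulated experience
            obs = next_obs
```

Under this reading, the sample-efficiency gain reported in the abstract comes from the dreaming phase substituting world-model rollouts for costly real game interactions, while on the hardware side only the readout layers of both networks are trained in the loop and all other layers remain fixed.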