In-context learning (ICL), a property demonstrated by transformer-based sequence models, refers to the automatic inference of an input-output mapping based on examples of the mapping provided as context. ICL requires no explicit learning, i.e., no explicit updates of model weights, directly mapping context and new input to the new output. Prior work has proved the usefulness of ICL for detection in MIMO channels. In this setting, the context is given by pilot symbols, and ICL automatically adapts a detector, or equalizer, to apply to newly received signals. However, the implementation tested in prior art was based on conventional artificial neural networks (ANNs), which may prove too energy-demanding to be run on mobile devices. This paper evaluates a neuromorphic implementation of the transformer for ICL-based MIMO detection. This approach replaces ANNs with spiking neural networks (SNNs), and implements the attention mechanism via stochastic computing, requiring no multiplications, but only logical AND operations and counting. When using conventional digital CMOS hardware, the proposed implementation is shown to preserve accuracy, with a reduction in power consumption ranging from $5.4\times$ to $26.8\times$, depending on the model sizes, as compared to ANN-based implementations.
Spiking Neural Networks (SNNs) have been recently integrated into Transformer architectures due to their potential to reduce computational demands and to improve power efficiency. Yet, the implementation of the attention mechanism using spiking signals on general-purpose computing platforms remains inefficient. In this paper, we propose a novel framework leveraging stochastic computing (SC) to effectively execute the dot-product attention for SNN-based Transformers. We demonstrate that our approach can achieve high classification accuracy ($83.53\%$) on CIFAR-10 within 10 time steps, which is comparable to the performance of a baseline artificial neural network implementation ($83.66\%$). We estimate that the proposed SC approach can lead to over $6.3\times$ reduction in computing energy and $1.7\times$ reduction in memory access costs for a digital CMOS-based ASIC design. We experimentally validate our stochastic attention block design through an FPGA implementation, which is shown to achieve $48\times$ lower latency as compared to a GPU implementation, while consuming $15\times$ less power.
Bayesian neural networks offer better estimates of model uncertainty compared to frequentist networks. However, inference involving Bayesian models requires multiple instantiations or sampling of the network parameters, requiring significant computational resources. Compared to traditional deep learning networks, spiking neural networks (SNNs) have the potential to reduce computational area and power, thanks to their event-driven and spike-based computational framework. Most works in literature either address frequentist SNN models or non-spiking Bayesian neural networks. In this work, we demonstrate an optimization framework for developing and implementing efficient Bayesian SNNs in hardware by additionally restricting network weights to be binary-valued to further decrease power and area consumption. We demonstrate accuracies comparable to Bayesian binary networks with full-precision Bernoulli parameters, while requiring up to $25\times$ less spikes than equivalent binary SNN implementations. We show the feasibility of the design by mapping it onto Zynq-7000, a lightweight SoC, and achieve a $6.5 \times$ improvement in GOPS/DSP while utilizing up to 30 times less power compared to the state-of-the-art.
Spiking neural networks (SNNs) implemented on neuromorphic processors (NPs) can enhance the energy efficiency of deployments of artificial intelligence (AI) for specific workloads. As such, NP represents an interesting opportunity for implementing AI tasks on board power-limited satellite communication spacecraft. In this article, we disseminate the findings of a recently completed study which targeted the comparison in terms of performance and power-consumption of different satellite communication use cases implemented on standard AI accelerators and on NPs. In particular, the article describes three prominent use cases, namely payload resource optimization, onboard interference detection and classification, and dynamic receive beamforming; and compare the performance of conventional convolutional neural networks (CNNs) implemented on Xilinx's VCK5000 Versal development card and SNNs on Intel's neuromorphic chip Loihi 2.
Recent strides in low-latency spiking neural network (SNN) algorithms have drawn significant interest, particularly due to their event-driven computing nature and fast inference capability. One of the most efficient ways to construct a low-latency SNN is by converting a pre-trained, low-bit artificial neural network (ANN) into an SNN. However, this conversion process faces two main challenges: First, converting SNNs from low-bit ANNs can lead to ``occasional noise" -- the phenomenon where occasional spikes are generated in spiking neurons where they should not be -- during inference, which significantly lowers SNN accuracy. Second, although low-latency SNNs initially show fast improvements in accuracy with time steps, these accuracy growths soon plateau, resulting in their peak accuracy lagging behind both full-precision ANNs and traditional ``long-latency SNNs'' that prioritize precision over speed. In response to these two challenges, this paper introduces a novel technique named ``noise adaptor.'' Noise adaptor can model occasional noise during training and implicitly optimize SNN accuracy, particularly at high simulation times $T$. Our research utilizes the ResNet model for a comprehensive analysis of the impact of the noise adaptor on low-latency SNNs. The results demonstrate that our method outperforms the previously reported quant-ANN-to-SNN conversion technique. We achieved an accuracy of 95.95\% within 4 time steps on CIFAR-10 using ResNet-18, and an accuracy of 74.37\% within 64 time steps on ImageNet using ResNet-50. Remarkably, these results were obtained without resorting to any noise correction methods during SNN inference, such as negative spikes or two-stage SNN simulations. Our approach significantly boosts the peak accuracy of low-latency SNNs, bringing them on par with the accuracy of full-precision ANNs. Code will be open source.
Artificial intelligence (AI) algorithms based on neural networks have been designed for decades with the goal of maximising some measure of accuracy. This has led to two undesired effects. First, model complexity has risen exponentially when measured in terms of computation and memory requirements. Second, state-of-the-art AI models are largely incapable of providing trustworthy measures of their uncertainty, possibly `hallucinating' their answers and discouraging their adoption for decision-making in sensitive applications. With the goal of realising efficient and trustworthy AI, in this paper we highlight research directions at the intersection of hardware and software design that integrate physical insights into computational substrates, neuroscientific principles concerning efficient information processing, information-theoretic results on optimal uncertainty quantification, and communication-theoretic guidelines for distributed processing. Overall, the paper advocates for novel design methodologies that target not only accuracy but also uncertainty quantification, while leveraging emerging computing hardware architectures that move beyond the traditional von Neumann digital computing paradigm to embrace in-memory, neuromorphic, and quantum computing technologies. An important overarching principle of the proposed approach is to view the stochasticity inherent in the computational substrate and in the communication channels between processors as a resource to be leveraged for the purpose of representing and processing classical and quantum uncertainty.
The latest satellite communication (SatCom) missions are characterized by a fully reconfigurable on-board software-defined payload, capable of adapting radio resources to the temporal and spatial variations of the system traffic. As pure optimization-based solutions have shown to be computationally tedious and to lack flexibility, machine learning (ML)-based methods have emerged as promising alternatives. We investigate the application of energy-efficient brain-inspired ML models for on-board radio resource management. Apart from software simulation, we report extensive experimental results leveraging the recently released Intel Loihi 2 chip. To benchmark the performance of the proposed model, we implement conventional convolutional neural networks (CNN) on a Xilinx Versal VCK5000, and provide a detailed comparison of accuracy, precision, recall, and energy efficiency for different traffic demands. Most notably, for relevant workloads, spiking neural networks (SNNs) implemented on Loihi 2 yield higher accuracy, while reducing power consumption by more than 100$\times$ as compared to the CNN-based reference platform. Our findings point to the significant potential of neuromorphic computing and SNNs in supporting on-board SatCom operations, paving the way for enhanced efficiency and sustainability in future SatCom systems.
Brain-computer interfaces are being explored for a wide variety of therapeutic applications. Typically, this involves measuring and analyzing continuous-time electrical brain activity via techniques such as electrocorticogram (ECoG) or electroencephalography (EEG) to drive external devices. However, due to the inherent noise and variability in the measurements, the analysis of these signals is challenging and requires offline processing with significant computational resources. In this paper, we propose a simple yet efficient machine learning-based approach for the exemplary problem of hand gesture classification based on brain signals. We use a hybrid machine learning approach that uses a convolutional spiking neural network employing a bio-inspired event-driven synaptic plasticity rule for unsupervised feature learning of the measured analog signals encoded in the spike domain. We demonstrate that this approach generalizes to different subjects with both EEG and ECoG data and achieves superior accuracy in the range of 92.74-97.07% in identifying different hand gesture classes and motor imagery tasks.
Bayesian Neural Networks (BNNs) can overcome the problem of overconfidence that plagues traditional frequentist deep neural networks, and are hence considered to be a key enabler for reliable AI systems. However, conventional hardware realizations of BNNs are resource intensive, requiring the implementation of random number generators for synaptic sampling. Owing to their inherent stochasticity during programming and read operations, nanoscale memristive devices can be directly leveraged for sampling, without the need for additional hardware resources. In this paper, we introduce a novel Phase Change Memory (PCM)-based hardware implementation for BNNs with binary synapses. The proposed architecture consists of separate weight and noise planes, in which PCM cells are configured and operated to represent the nominal values of weights and to generate the required noise for sampling, respectively. Using experimentally observed PCM noise characteristics, for the exemplary Breast Cancer Dataset classification problem, we obtain hardware accuracy and expected calibration error matching that of an 8-bit fixed-point (FxP8) implementation, with projected savings of over 9$\times$ in terms of core area transistor count.