Priyadarshini Panda

Are SNNs Truly Energy-efficient? - A Hardware Perspective

Sep 06, 2023
Abhiroop Bhattacharjee, Ruokai Yin, Abhishek Moitra, Priyadarshini Panda

Spiking Neural Networks (SNNs) have gained attention for their energy-efficient machine learning capabilities, utilizing bio-inspired activation functions and sparse binary spike-data representations. While recent SNN algorithmic advances achieve high accuracy on large-scale computer vision tasks, their energy-efficiency claims rely on estimation metrics that are impractical in hardware. This work studies two hardware benchmarking platforms for large-scale SNN inference, namely SATA and SpikeSim. SATA is a sparsity-aware systolic-array accelerator, while SpikeSim evaluates SNNs implemented on In-Memory Computing (IMC) based analog crossbars. Using these tools, we find that the actual energy-efficiency improvements of recent SNN algorithmic works differ significantly from their estimated values due to various hardware bottlenecks. We identify and address key roadblocks to efficient SNN deployment on hardware, including repeated computations & data movements over timesteps, neuronal module overhead, and the vulnerability of SNNs to crossbar non-idealities.

* 5 pages 

Artificial to Spiking Neural Networks Conversion for Scientific Machine Learning

Aug 31, 2023
Qian Zhang, Chenxi Wu, Adar Kahana, Youngeun Kim, Yuhang Li, George Em Karniadakis, Priyadarshini Panda

We introduce a method to convert Physics-Informed Neural Networks (PINNs), commonly used in scientific machine learning, to Spiking Neural Networks (SNNs), which are expected to offer higher energy efficiency than traditional Artificial Neural Networks (ANNs). We first extend the calibration technique for SNNs to arbitrary activation functions beyond ReLU, making it more versatile, and we prove a theorem that guarantees the effectiveness of the calibration. We successfully convert PINNs to SNNs, enabling computational efficiency for diverse regression tasks in solving multiple differential equations, including the unsteady Navier-Stokes equations. We demonstrate substantial gains in overall efficiency, including with Separable PINNs (SPINNs), which accelerate the training process. Overall, this is the first work of its kind, and the proposed method achieves relatively good accuracy at low spike rates.
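
As context for the conversion step, here is a minimal sketch (not the authors' method) of the rate-coded integrate-and-fire building block that ANN-to-SNN calibration schemes typically start from: the neuron's firing rate approximates a clipped ReLU, and the threshold and timestep count are the quantities that calibration tunes. The paper's extension of calibration to activations beyond ReLU is not captured by this sketch.

```python
import numpy as np

def if_neuron_rate(a, threshold=1.0, timesteps=64):
    """Simulate an integrate-and-fire neuron driven by a constant input 'a' per
    timestep and return threshold * (firing rate). Under rate coding this
    approximates the clipped ReLU max(0, min(a, threshold))."""
    v, spikes = 0.0, 0
    for _ in range(timesteps):
        v += a
        if v >= threshold:
            spikes += 1
            v -= threshold            # reset-by-subtraction keeps the residual
    return threshold * spikes / timesteps

# The approximation error shrinks as the number of timesteps grows; calibration
# adjusts thresholds (and biases) per layer to reduce the remaining error.
print([round(if_neuron_rate(a), 3) for a in np.linspace(-0.5, 1.5, 5)])
# ≈ [0.0, 0.0, 0.5, 1.0, 1.0]
```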

Examining the Role and Limits of Batchnorm Optimization to Mitigate Diverse Hardware-noise in In-memory Computing

May 28, 2023
Abhiroop Bhattacharjee, Abhishek Moitra, Youngeun Kim, Yeshwanth Venkatesha, Priyadarshini Panda

In-Memory Computing (IMC) platforms such as analog crossbars are gaining traction because they facilitate the acceleration of low-precision Deep Neural Networks (DNNs) with high area- & compute-efficiency. However, the intrinsic non-idealities in crossbars, which are often non-deterministic and non-linear, degrade the performance of the deployed DNNs. In addition to quantization errors, the most frequently encountered non-idealities during inference include crossbar circuit-level parasitic resistances and device-level non-idealities such as stochastic read noise and temporal drift. In this work, we closely examine the distortions caused by these non-idealities on the dot-product operations in analog crossbars and explore the feasibility of a nearly training-less solution: crossbar-aware fine-tuning of batchnorm parameters in real time to mitigate the impact of the non-idealities. This reduces hardware costs, in terms of memory and training energy, relative to IMC noise-aware retraining of the DNN weights on crossbars.
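
A minimal sketch of the batchnorm-only adaptation idea, assuming a PyTorch CNN with BatchNorm2d layers: all weights stay frozen (as they would once programmed onto crossbars) and only the batchnorm affine parameters and running statistics are updated. The multiplicative Gaussian perturbation is an illustrative stand-in for a crossbar non-ideality model, and the function names and hyperparameters are assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

def perturb_weights(model, sigma=0.05):
    """Illustrative stand-in for crossbar non-idealities: multiplicative
    Gaussian read noise on conv/linear weights (not the paper's noise model)."""
    with torch.no_grad():
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                m.weight.mul_(1.0 + sigma * torch.randn_like(m.weight))

def finetune_batchnorm_only(model, loader, device, epochs=1, lr=1e-3):
    """Freeze everything except BatchNorm affine parameters and fine-tune them,
    letting the running statistics also adapt to the perturbed weights."""
    for p in model.parameters():
        p.requires_grad = False
    bn_params = []
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.weight.requires_grad = True
            m.bias.requires_grad = True
            bn_params += [m.weight, m.bias]
    optimizer = torch.optim.SGD(bn_params, lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()   # train mode so BN running stats update on noisy forward passes
    for _ in range(epochs):
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            criterion(model(inputs), targets).backward()
            optimizer.step()
    return model
```

Calling perturb_weights(model) before finetune_batchnorm_only(...) mimics adapting to a fixed noisy deployment; in practice the perturbation would come from a crossbar simulator.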

* Accepted at the Great Lakes Symposium on VLSI 2023 (GLSVLSI 2023) 

Input-Aware Dynamic Timestep Spiking Neural Networks for Efficient In-Memory Computing

May 27, 2023
Yuhang Li, Abhishek Moitra, Tamar Geller, Priyadarshini Panda

Spiking Neural Networks (SNNs) have recently attracted widespread research interest as an efficient alternative to traditional Artificial Neural Networks (ANNs) because of their capability to process sparse and binary spike information and avoid expensive multiplication operations. Although the efficiency of SNNs can be realized on In-Memory Computing (IMC) architectures, we show that the energy cost and latency of SNNs scale linearly with the number of timesteps used on IMC hardware. Therefore, to maximize the efficiency of SNNs, we propose the input-aware Dynamic Timestep SNN (DT-SNN), a novel algorithmic solution that dynamically determines the number of timesteps during inference on an input-dependent basis. By calculating the entropy of the accumulated output after each timestep and comparing it to a predefined threshold, we decide whether the information processed up to the current timestep is sufficient for a confident prediction. We deploy DT-SNN on an IMC architecture and show that it incurs negligible computational overhead. We demonstrate that our method uses only 1.46 timesteps on average to achieve the accuracy of a 4-timestep static SNN while reducing the energy-delay product by 80%.
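
A minimal sketch of the entropy-based early exit described above, assuming a hypothetical snn_step(x, t) callable that returns the classifier output for timestep t; the entropy threshold and timestep budget are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def dynamic_timestep_inference(snn_step, x, max_timesteps=4, entropy_thresh=0.5):
    """Accumulate SNN outputs one timestep at a time and stop as soon as the
    entropy of the accumulated prediction drops below a confidence threshold."""
    accumulated = None
    for t in range(max_timesteps):
        out_t = snn_step(x, t)                        # output logits at timestep t
        accumulated = out_t if accumulated is None else accumulated + out_t
        probs = F.softmax(accumulated, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
        if (entropy < entropy_thresh).all():          # confident enough: exit early
            return accumulated, t + 1
    return accumulated, max_timesteps
```

Here the whole batch exits together for simplicity; a per-sample exit (masking finished inputs) would be needed to reproduce truly input-dependent timesteps.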

* Published at the Design Automation Conference (DAC) 2023 

Sharing Leaky-Integrate-and-Fire Neurons for Memory-Efficient Spiking Neural Networks

May 26, 2023
Youngeun Kim, Yuhang Li, Abhishek Moitra, Ruokai Yin, Priyadarshini Panda

Spiking Neural Networks (SNNs) have gained increasing attention as energy-efficient neural networks owing to their binary and asynchronous computation. However, their non-linear activation, the Leaky-Integrate-and-Fire (LIF) neuron, requires additional memory to store a membrane voltage that captures the temporal dynamics of spikes. Although the memory cost of LIF neurons increases significantly as the input dimension grows, techniques to reduce this memory have not been explored so far. To address this, we propose a simple and effective solution, EfficientLIF-Net, which shares LIF neurons across different layers and channels. Our EfficientLIF-Net achieves accuracy comparable to standard SNNs while providing up to ~4.3X forward memory efficiency and ~21.9X backward memory efficiency for LIF neurons. We conduct experiments on various datasets including CIFAR10, CIFAR100, TinyImageNet, ImageNet-100, and N-Caltech101. Furthermore, we show that our approach also offers advantages on Human Activity Recognition (HAR) datasets, which heavily rely on temporal information.
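
A minimal sketch of the neuron-sharing idea using a generic leaky-integrate-and-fire formulation (fixed leak, hard reset) rather than the paper's exact EfficientLIF-Net scheme: a single LIF module, and hence a single membrane buffer, is reused by several layers of matching shape instead of allocating one membrane tensor per layer. Forward pass only; surrogate-gradient training is omitted.

```python
import torch
import torch.nn as nn

class SharedLIF(nn.Module):
    """Generic LIF neuron whose one membrane buffer is reused wherever the
    module is called, instead of keeping a separate membrane per layer."""
    def __init__(self, leak=0.5, threshold=1.0):
        super().__init__()
        self.leak, self.threshold = leak, threshold
        self.membrane = None

    def reset_state(self):
        self.membrane = None

    def forward(self, current):
        if self.membrane is None:
            self.membrane = torch.zeros_like(current)
        self.membrane = self.leak * self.membrane + current
        spikes = (self.membrane >= self.threshold).float()
        self.membrane = self.membrane * (1.0 - spikes)   # hard reset on spike
        return spikes

# Two same-shaped conv layers share one LIF module (and one membrane buffer).
conv1, conv2 = nn.Conv2d(64, 64, 3, padding=1), nn.Conv2d(64, 64, 3, padding=1)
lif = SharedLIF()
x = torch.randn(1, 64, 32, 32)
with torch.no_grad():
    for t in range(4):                 # timesteps
        out = lif(conv2(lif(conv1(x))))
lif.reset_state()                      # clear the state between samples
```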

Do We Really Need a Large Number of Visual Prompts?

May 26, 2023
Youngeun Kim, Yuhang Li, Abhishek Moitra, Priyadarshini Panda

Due to increasing interest in adapting models on resource-constrained edge devices, parameter-efficient transfer learning has been widely explored. Among various methods, Visual Prompt Tuning (VPT), which prepends learnable prompts to the input space, achieves fine-tuning performance competitive with training the full set of network parameters. However, VPT increases the number of input tokens, resulting in additional computational overhead. In this paper, we analyze the impact of the number of prompts on fine-tuning performance and the self-attention operation in a vision transformer architecture. Through theoretical and empirical analysis, we show that adding more prompts does not lead to linear performance improvement. Further, we propose a Prompt Condensation (PC) technique that aims to prevent performance degradation when using a small number of prompts. We validate our methods on the FGVC and VTAB-1k tasks and show that our approach reduces the number of prompts by ~70% while maintaining accuracy.
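
A minimal sketch of prompt prepending, assuming a ViT-style backbone exposed as patch_embed and encoder callables; the attribute names, prompt count, and initialization are illustrative assumptions, and the Prompt Condensation technique itself is not shown.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Visual-prompt-tuning-style wrapper: a small set of learnable prompt
    tokens is prepended to a frozen backbone's token sequence."""
    def __init__(self, patch_embed, encoder, embed_dim=768, num_prompts=10):
        super().__init__()
        self.patch_embed = patch_embed                 # frozen patch embedding
        self.encoder = encoder                         # frozen transformer blocks
        self.prompts = nn.Parameter(0.02 * torch.randn(1, num_prompts, embed_dim))
        for p in list(patch_embed.parameters()) + list(encoder.parameters()):
            p.requires_grad = False                    # train only the prompts (and a head)

    def forward(self, x):
        tokens = self.patch_embed(x)                   # (B, N, D)
        prompts = self.prompts.expand(tokens.size(0), -1, -1)
        return self.encoder(torch.cat([prompts, tokens], dim=1))   # (B, P + N, D)
```

Because self-attention cost grows with sequence length, every additional prompt token adds compute, which is the overhead analyzed above.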

MINT: Multiplier-less Integer Quantization for Spiking Neural Networks

May 20, 2023
Ruokai Yin, Yuhang Li, Abhishek Moitra, Priyadarshini Panda

We propose Multiplier-less INTeger (MINT) quantization, an efficient uniform quantization scheme for the weights and membrane potentials in spiking neural networks (SNNs). Unlike prior SNN quantization works, MINT quantizes the memory-hungry membrane potentials to an extremely low bit-width (2-bit) to significantly reduce the total memory footprint. Additionally, MINT quantization shares the quantization scale between the weights and membrane potentials, eliminating the need for multipliers and floating-point arithmetic units, which are required by standard uniform quantization. Experimental results demonstrate that our proposed method achieves accuracy that matches other state-of-the-art SNN quantization works while outperforming them in total memory footprint and hardware cost at deployment time. For instance, 2-bit MINT VGG-16 achieves 48.6% accuracy on TinyImageNet (0.28% better than the full-precision baseline) with approximately 93.8% reduction in total memory footprint relative to the full-precision model; meanwhile, our model reduces area by 93% and dynamic power by 98% compared to other SNN quantization counterparts.
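
A minimal numerical sketch of the shared-scale idea: when weights and membrane potentials are quantized with the same scale and inputs are binary spikes, a fully-connected LIF update reduces to integer additions and a comparison, with no rescaling multipliers. The bit-widths, threshold, and layer shapes are illustrative, and MINT's 2-bit membrane clamping is omitted for brevity.

```python
import torch

def uniform_quantize(x, num_bits, scale):
    """Symmetric uniform quantization to signed integer levels for a given scale."""
    qmax = 2 ** (num_bits - 1) - 1
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax)

scale = 0.05                                   # one scale shared by weights and membrane
w = torch.randn(128, 256)                      # a fully-connected SNN layer
q_w = uniform_quantize(w, num_bits=4, scale=scale)          # integer weight levels
q_mem = torch.zeros(128)                                     # membrane in the same integer domain
q_thresh = torch.round(torch.tensor(1.0) / scale)            # threshold in integer levels

spikes_in = (torch.rand(256) < 0.1).float()                  # binary input spikes
# Because spikes are 0/1 and the scale is shared, the update needs only
# integer accumulation and a comparison -- no rescaling multiplication.
q_mem = q_mem + q_w @ spikes_in
spikes_out = (q_mem >= q_thresh).float()
q_mem = q_mem * (1.0 - spikes_out)                           # reset fired neurons
```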

* 11 pages 

Divide-and-Conquer the NAS puzzle in Resource Constrained Federated Learning Systems

May 11, 2023
Yeshwanth Venkatesha, Youngeun Kim, Hyoungseob Park, Priyadarshini Panda

Federated Learning (FL) is a privacy-preserving distributed machine learning approach geared towards applications on edge devices. However, the problem of designing custom neural architectures in federated environments has not been tackled from the perspective of overall system efficiency. In this paper, we propose DC-NAS -- a divide-and-conquer approach that performs supernet-based Neural Architecture Search (NAS) in a federated system by systematically sampling the search space. We propose a novel diversified sampling strategy that balances exploration and exploitation of the search space by initially maximizing the distance between samples and progressively shrinking this distance as training progresses. We then perform channel pruning to further reduce the training complexity at the devices. We show that our approach outperforms several sampling strategies, including Hadamard sampling, where the samples are maximally separated. We evaluate our method on the CIFAR10, CIFAR100, EMNIST, and TinyImagenet benchmarks and present a comprehensive analysis of different aspects of federated learning such as scalability and non-IID data. DC-NAS achieves near iso-accuracy compared to full-scale federated NAS while using 50% fewer resources.
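
A minimal sketch (not the authors' implementation) of the exploration-to-exploitation sampling schedule described above: candidate sub-networks are encoded as binary keep/drop vectors, early rounds enforce a large minimum Hamming distance between samples, and the requirement shrinks over rounds. The linear schedule and rejection sampling are illustrative assumptions.

```python
import numpy as np

def diversified_sample(num_choices, num_samples, round_idx, total_rounds, seed=0):
    """Draw binary architecture encodings whose minimum pairwise Hamming distance
    starts near half the encoding length and shrinks to 1 over training rounds."""
    rng = np.random.default_rng(seed + round_idx)
    frac = 1.0 - round_idx / max(1, total_rounds - 1)
    min_dist = max(1, int(frac * num_choices / 2))
    samples, attempts = [rng.integers(0, 2, num_choices)], 0
    while len(samples) < num_samples:
        cand = rng.integers(0, 2, num_choices)
        if all(np.count_nonzero(cand != s) >= min_dist for s in samples):
            samples.append(cand)
            attempts = 0
        else:
            attempts += 1
            if attempts > 1000:              # relax the constraint if it becomes infeasible
                min_dist, attempts = max(1, min_dist - 1), 0
    return np.stack(samples), min_dist

# Early rounds spread samples far apart (exploration); late rounds cluster them.
early, d_early = diversified_sample(20, 8, round_idx=0, total_rounds=10)
late, d_late = diversified_sample(20, 8, round_idx=9, total_rounds=10)
```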

Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient

Apr 25, 2023
Yuhang Li, Youngeun Kim, Hyoungseob Park, Priyadarshini Panda

Spiking Neural Networks (SNNs) are recognized as a candidate for next-generation neural networks due to their bio-plausibility and energy efficiency. Recently, researchers have demonstrated that SNNs are able to achieve nearly state-of-the-art performance on image recognition tasks using surrogate gradient training. However, some essential questions about SNNs remain little studied: Do SNNs trained with surrogate gradients learn different representations from traditional Artificial Neural Networks (ANNs)? Does the time dimension in SNNs provide unique representation power? In this paper, we aim to answer these questions by conducting a representation similarity analysis between SNNs and ANNs using Centered Kernel Alignment (CKA). We start by analyzing the spatial dimension of the networks, including both width and depth. Furthermore, our analysis of residual connections shows that SNNs learn a periodic pattern, which rectifies the representations in SNNs to be ANN-like. We additionally investigate the effect of the time dimension on SNN representations, finding that deeper layers encourage more dynamics along the time dimension. We also investigate the impact of input data such as event-stream data and adversarial attacks. Our work uncovers a host of new findings about representations in SNNs. We hope this work will inspire future research to fully comprehend the representation power of SNNs. Code is released at https://github.com/Intelligent-Computing-Lab-Yale/SNNCKA.
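
A minimal sketch of linear Centered Kernel Alignment, the similarity measure named above, in its textbook form (not necessarily the released code): features are centered per dimension and similarity is ||Y^T X||_F^2 / (||X^T X||_F ||Y^T Y||_F). Flattening spatial dimensions and averaging SNN activations over time are illustrative preprocessing assumptions.

```python
import torch

def linear_cka(X, Y):
    """Linear CKA between two representation matrices of shape
    (num_examples, num_features); values closer to 1 mean more similar."""
    X = X - X.mean(dim=0, keepdim=True)    # center each feature dimension
    Y = Y - Y.mean(dim=0, keepdim=True)
    hsic = (Y.t() @ X).norm(p='fro') ** 2
    return (hsic / ((X.t() @ X).norm(p='fro') * (Y.t() @ Y).norm(p='fro'))).item()

# Example: compare an SNN layer's time-averaged activations with an ANN layer's.
# snn_feats = snn_acts.mean(dim=1).flatten(1)   # (N, T, C, H, W) -> (N, C*H*W)
# ann_feats = ann_acts.flatten(1)               # (N, C, H, W)   -> (N, C*H*W)
# similarity = linear_cka(snn_feats, ann_feats)
```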

* Published in Transactions on Machine Learning Research (TMLR) 

NeuroBench: Advancing Neuromorphic Computing through Collaborative, Fair and Representative Benchmarking

Apr 15, 2023
Jason Yik, Soikat Hasan Ahmed, Zergham Ahmed, Brian Anderson, Andreas G. Andreou, Chiara Bartolozzi, Arindam Basu, Douwe den Blanken, Petrut Bogdan, Sander Bohte, Younes Bouhadjar, Sonia Buckley, Gert Cauwenberghs, Federico Corradi, Guido de Croon, Andreea Danielescu, Anurag Daram, Mike Davies, Yigit Demirag, Jason Eshraghian, Jeremy Forest, Steve Furber, Michael Furlong, Aditya Gilra, Giacomo Indiveri, Siddharth Joshi, Vedant Karia, Lyes Khacef, James C. Knight, Laura Kriener, Rajkumar Kubendran, Dhireesha Kudithipudi, Gregor Lenz, Rajit Manohar, Christian Mayr, Konstantinos Michmizos, Dylan Muir, Emre Neftci, Thomas Nowotny, Fabrizio Ottati, Ayca Ozcelikkale, Noah Pacik-Nelson, Priyadarshini Panda, Sun Pao-Sheng, Melika Payvand, Christian Pehle, Mihai A. Petrovici, Christoph Posch, Alpha Renner, Yulia Sandamirskaya, Clemens JS Schaefer, André van Schaik, Johannes Schemmel, Catherine Schuman, Jae-sun Seo, Sadique Sheik, Sumit Bam Shrestha, Manolis Sifalakis, Amos Sironi, Kenneth Stewart, Terrence C. Stewart, Philipp Stratmann, Guangzhi Tang, Jonathan Timcheck, Marian Verhelst, Craig M. Vineyard, Bernhard Vogginger, Amirreza Yousefzadeh, Biyan Zhou, Fatima Tuz Zohora, Charlotte Frenkel, Vijay Janapa Reddi

The field of neuromorphic computing holds great promise for advancing computing efficiency and capabilities by following brain-inspired principles. However, the rich diversity of techniques employed in neuromorphic research has resulted in a lack of clear benchmarking standards, hindering effective evaluation of the advantages and strengths of neuromorphic methods compared to traditional deep-learning-based methods. This paper presents a collaborative effort, bringing together members from academia and industry, to define benchmarks for neuromorphic computing: NeuroBench. The goal of NeuroBench is to be a collaborative, fair, and representative benchmark suite developed by the community, for the community. In this paper, we discuss the challenges associated with benchmarking neuromorphic solutions and outline the key features of NeuroBench. We believe that NeuroBench will be a significant step towards defining standards that can unify the goals of neuromorphic computing and drive its technological progress. Please visit neurobench.ai for the latest updates on the benchmark tasks and metrics.
