Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Caiwen Ding

Katie

Towards Zero Memory Footprint Spiking Neural Network Training

Aug 16, 2023

Bin Lei, Sheng Lin, Pei-Hung Lin, Chunhua Liao, Caiwen Ding

Abstract:Biologically-inspired Spiking Neural Networks (SNNs), processing information using discrete-time events known as spikes rather than continuous values, have garnered significant attention due to their hardware-friendly and energy-efficient characteristics. However, the training of SNNs necessitates a considerably large memory footprint, given the additional storage requirements for spikes or events, leading to a complex structure and dynamic setup. In this paper, to address memory constraint in SNN training, we introduce an innovative framework, characterized by a remarkably low memory footprint. We \textbf{(i)} design a reversible SNN node that retains a high level of accuracy. Our design is able to achieve a $\mathbf{58.65\times}$ reduction in memory usage compared to the current SNN node. We \textbf{(ii)} propose a unique algorithm to streamline the backpropagation process of our reversible SNN node. This significantly trims the backward Floating Point Operations Per Second (FLOPs), thereby accelerating the training process in comparison to current reversible layer backpropagation method. By using our algorithm, the training time is able to be curtailed by $\mathbf{23.8\%}$ relative to existing reversible layer architectures.

Via

Access Paper or Ask Questions

Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought

Aug 16, 2023

Bin Lei, pei-Hung Lin, Chunhua Liao, Caiwen Ding

Abstract:Recent advancements in large-scale models, such as GPT-4, have showcased remarkable capabilities in addressing standard queries. However, when facing complex problems that require multi-step logical reasoning, their accuracy dramatically decreases. Current research has explored the realm of \textit{prompting engineering} to bolster the inferential capacities of these models. Our paper unveils a pioneering prompting technique, dubbed \textit{Graph of Thoughts (GoT)}. Through testing on a trio of escalating challenges: the 24-point game, resolution of high-degree polynomial equations, and derivation of formulas for recursive sequences, our method outperformed GPT-4, achieving accuracy improvements of $89.7\%$, $86\%$, and $56\%$ for each respective task. Moreover, when juxtaposed with the state-of-the-art (SOTA) prompting method, \textit{Tree of Thought (ToT)}, our approach registered an average accuracy boost of $23\%$, $24\%$, and $15\%$.

Via

Access Paper or Ask Questions

Tango: rethinking quantization for graph neural network training on GPUs

Aug 02, 2023

Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu

Figure 1 for Tango: rethinking quantization for graph neural network training on GPUs

Figure 2 for Tango: rethinking quantization for graph neural network training on GPUs

Figure 3 for Tango: rethinking quantization for graph neural network training on GPUs

Figure 4 for Tango: rethinking quantization for graph neural network training on GPUs

Abstract:Graph Neural Networks (GNNs) are becoming increasingly popular due to their superior performance in critical graph-related tasks. While quantization is widely used to accelerate GNN computation, quantized training faces unprecedented challenges. Current quantized GNN training systems often have longer training times than their full-precision counterparts for two reasons: (i) addressing the accuracy challenge leads to excessive overhead, and (ii) the optimization potential exposed by quantization is not adequately leveraged. This paper introduces Tango which re-thinks quantization challenges and opportunities for graph neural network training on GPUs with three contributions: Firstly, we introduce efficient rules to maintain accuracy during quantized GNN training. Secondly, we design and implement quantization-aware primitives and inter-primitive optimizations that can speed up GNN training. Finally, we integrate Tango with the popular Deep Graph Library (DGL) system and demonstrate its superior performance over state-of-the-art approaches on various GNN models and datasets.

Via

Access Paper or Ask Questions

Creating a Dataset for High-Performance Computing Code Translation: A Bridge Between HPC Fortran and C++

Jul 28, 2023

Bin Lei, Caiwen Ding, Le Chen, Pei-Hung Lin, Chunhua Liao

Figure 1 for Creating a Dataset for High-Performance Computing Code Translation: A Bridge Between HPC Fortran and C++

Figure 2 for Creating a Dataset for High-Performance Computing Code Translation: A Bridge Between HPC Fortran and C++

Figure 3 for Creating a Dataset for High-Performance Computing Code Translation: A Bridge Between HPC Fortran and C++

Figure 4 for Creating a Dataset for High-Performance Computing Code Translation: A Bridge Between HPC Fortran and C++

Abstract:In this study, we present a novel dataset for training machine learning models translating between OpenMP Fortran and C++ code. To ensure reliability and applicability, the dataset is initially refined using a meticulous code similarity test. The effectiveness of our dataset is assessed using both quantitative (CodeBLEU) and qualitative (human evaluation) methods. We demonstrate how this dataset can significantly improve the translation capabilities of large-scale language models, with improvements of $\mathbf{\times 5.1}$ for models with no prior coding knowledge and $\mathbf{\times 9.9}$ for models with some coding familiarity. Our work highlights the potential of this dataset to advance the field of code translation for high-performance computing. The dataset is available at https://github.com/bin123apple/Fortran-CPP-HPC-code-translation-dataset

Via

Access Paper or Ask Questions

Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

Jul 25, 2023

Ce Feng, Nuo Xu, Wujie Wen, Parv Venkitasubramaniam, Caiwen Ding

Figure 1 for Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

Figure 2 for Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

Figure 3 for Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

Figure 4 for Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

Abstract:Differential privacy is a widely accepted measure of privacy in the context of deep learning algorithms, and achieving it relies on a noisy training approach known as differentially private stochastic gradient descent (DP-SGD). DP-SGD requires direct noise addition to every gradient in a dense neural network, the privacy is achieved at a significant utility cost. In this work, we present Spectral-DP, a new differentially private learning approach which combines gradient perturbation in the spectral domain with spectral filtering to achieve a desired privacy guarantee with a lower noise scale and thus better utility. We develop differentially private deep learning methods based on Spectral-DP for architectures that contain both convolution and fully connected layers. In particular, for fully connected layers, we combine a block-circulant based spatial restructuring with Spectral-DP to achieve better utility. Through comprehensive experiments, we study and provide guidelines to implement Spectral-DP deep learning on benchmark datasets. In comparison with state-of-the-art DP-SGD based approaches, Spectral-DP is shown to have uniformly better utility performance in both training from scratch and transfer learning settings.

* Accepted in 2023 IEEE Symposium on Security and Privacy (SP)

Via

Access Paper or Ask Questions

Dynamic Gradient Balancing for Enhanced Adversarial Attacks on Multi-Task Models

May 20, 2023

Lijun Zhang, Xiao Liu, Kaleel Mahmood, Caiwen Ding, Hui Guan

Abstract:Multi-task learning (MTL) creates a single machine learning model called multi-task model to simultaneously perform multiple tasks. Although the security of single task classifiers has been extensively studied, there are several critical security research questions for multi-task models including 1) How secure are multi-task models to single task adversarial machine learning attacks, 2) Can adversarial attacks be designed to attack multiple tasks simultaneously, and 3) Does task sharing and adversarial training increase multi-task model robustness to adversarial attacks? In this paper, we answer these questions through careful analysis and rigorous experimentation. First, we develop na\"ive adaptation of single-task white-box attacks and analyze their inherent drawbacks. We then propose a novel attack framework, Dynamic Gradient Balancing Attack (DGBA). Our framework poses the problem of attacking a multi-task model as an optimization problem based on averaged relative loss change, which can be solved by approximating the problem as an integer linear programming problem. Extensive evaluation on two popular MTL benchmarks, NYUv2 and Tiny-Taxonomy, demonstrates the effectiveness of DGBA compared to na\"ive multi-task attack baselines on both clean and adversarially trained multi-task models. The results also reveal a fundamental trade-off between improving task accuracy by sharing parameters across tasks and undermining model robustness due to increased attack transferability from parameter sharing.

* 19 pages, 5 figures

Via

Access Paper or Ask Questions

Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

Apr 24, 2023

Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding

Figure 1 for Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

Figure 2 for Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

Figure 3 for Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

Figure 4 for Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

Abstract:Biologically inspired Spiking Neural Networks (SNNs) have attracted significant attention for their ability to provide extremely energy-efficient machine intelligence through event-driven operation and sparse activities. As artificial intelligence (AI) becomes ever more democratized, there is an increasing need to execute SNN models on edge devices. Existing works adopt weight pruning to reduce SNN model size and accelerate inference. However, these methods mainly focus on how to obtain a sparse model for efficient inference, rather than training efficiency. To overcome these drawbacks, in this paper, we propose a Neurogenesis Dynamics-inspired Spiking Neural Network training acceleration framework, NDSNN. Our framework is computational efficient and trains a model from scratch with dynamic sparsity without sacrificing model fidelity. Specifically, we design a new drop-and-grow strategy with decreasing number of non-zero weights, to maintain extreme high sparsity and high accuracy. We evaluate NDSNN using VGG-16 and ResNet-19 on CIFAR-10, CIFAR-100 and TinyImageNet. Experimental results show that NDSNN achieves up to 20.52\% improvement in accuracy on Tiny-ImageNet using ResNet-19 (with a sparsity of 99\%) as compared to other SOTA methods (e.g., Lottery Ticket Hypothesis (LTH), SET-SNN, RigL-SNN). In addition, the training cost of NDSNN is only 40.89\% of the LTH training cost on ResNet-19 and 31.35\% of the LTH training cost on VGG-16 on CIFAR-10.

Via

Access Paper or Ask Questions

Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning

Apr 08, 2023

Shanglin Zhou, Mikhail A. Bragin, Lynn Pepin, Deniz Gurevin, Fei Miao, Caiwen Ding

Abstract:Network pruning is a widely used technique to reduce computation cost and model size for deep neural networks. However, the typical three-stage pipeline significantly increases the overall training time. In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian relaxation, which is tailored to overcome difficulties caused by the discrete nature of the weight-pruning problem. We prove that our method ensures fast convergence of the model compression problem, and the convergence of the SLR is accelerated by using quadratic penalties. Model parameters obtained by SLR during the training phase are much closer to their optimal values as compared to those obtained by other state-of-the-art methods. We evaluate our method on image classification tasks using CIFAR-10 and ImageNet with state-of-the-art MLP-Mixer, Swin Transformer, and VGG-16, ResNet-18, ResNet-50 and ResNet-110, MobileNetV2. We also evaluate object detection and segmentation tasks on COCO, KITTI benchmark, and TuSimple lane detection dataset using a variety of models. Experimental results demonstrate that our SLR-based weight-pruning optimization approach achieves a higher compression rate than state-of-the-art methods under the same accuracy requirement and also can achieve higher accuracy under the same compression rate requirement. Under classification tasks, our SLR approach converges to the desired accuracy $3\times$ faster on both of the datasets. Under object detection and segmentation tasks, SLR also converges $2\times$ faster to the desired accuracy. Further, our SLR achieves high model accuracy even at the hard-pruning stage without retraining, which reduces the traditional three-stage pruning into a two-stage process. Given a limited budget of retraining epochs, our approach quickly recovers the model's accuracy.

* arXiv admin note: text overlap with arXiv:2012.10079

Via

Access Paper or Ask Questions

Physics-aware Roughness Optimization for Diffractive Optical Neural Networks

Apr 04, 2023

Shanglin Zhou, Yingjie Li, Minhan Lou, Weilu Gao, Zhijie Shi, Cunxi Yu, Caiwen Ding

Abstract:As a representative next-generation device/circuit technology beyond CMOS, diffractive optical neural networks (DONNs) have shown promising advantages over conventional deep neural networks due to extreme fast computation speed (light speed) and low energy consumption. However, there is a mismatch, i.e., significant prediction accuracy loss, between the DONN numerical modelling and physical optical device deployment, because of the interpixel interaction within the diffractive layers. In this work, we propose a physics-aware diffractive optical neural network training framework to reduce the performance difference between numerical modeling and practical deployment. Specifically, we propose the roughness modeling regularization in the training process and integrate the physics-aware sparsification method to introduce sparsity to the phase masks to reduce sharp phase changes between adjacent pixels in diffractive layers. We further develop $2\pi$ periodic optimization to reduce the roughness of the phase masks to preserve the performance of DONN. Experiment results demonstrate that, compared to state-of-the-arts, our physics-aware optimization can provide $35.7\%$, $34.2\%$, $28.1\%$, and $27.3\%$ reduction in roughness with only accuracy loss on MNIST, FMNIST, KMNIST, and EMNIST, respectively.

* This paper is accepted by the Design Automation Conference (DAC), 2023

Via

Access Paper or Ask Questions

Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

Mar 25, 2023

Sanbao Su, Songyang Han, Yiming Li, Zhili Zhang, Chen Feng, Caiwen Ding, Fei Miao

Figure 1 for Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

Figure 2 for Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

Figure 3 for Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

Figure 4 for Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

Abstract:Object detection and multiple object tracking (MOT) are essential components of self-driving systems. Accurate detection and uncertainty quantification are both critical for onboard modules, such as perception, prediction, and planning, to improve the safety and robustness of autonomous vehicles. Collaborative object detection (COD) has been proposed to improve detection accuracy and reduce uncertainty by leveraging the viewpoints of multiple agents. However, little attention has been paid on how to leverage the uncertainty quantification from COD to enhance MOT performance. In this paper, as the first attempt, we design the uncertainty propagation framework to address this challenge, called MOT-CUP. Our framework first quantifies the uncertainty of COD through direct modeling and conformal prediction, and propogates this uncertainty information during the motion prediction and association steps. MOT-CUP is designed to work with different collaborative object detectors and baseline MOT algorithms. We evaluate MOT-CUP on V2X-Sim, a comprehensive collaborative perception dataset, and demonstrate a 2% improvement in accuracy and a 2.67X reduction in uncertainty compared to the baselines, e.g., SORT and ByteTrack. MOT-CUP demonstrates the importance of uncertainty quantification in both COD and MOT, and provides the first attempt to improve the accuracy and reduce the uncertainty in MOT based on COD through uncertainty propogation.

Via

Access Paper or Ask Questions