Abstract:The Number Theoretic Transform (NTT) is an indispensable tool for computing efficient polynomial multiplications in post-quantum lattice-based cryptography. It has strong resemblance with the Fast Fourier Transform (FFT), which is the most widely used algorithm in digital signal processing. In this work, we demonstrate a unified hardware accelerator supporting both 512-point complex FFT as well as 256-point NTT for the recently standardized NIST post-quantum key encapsulation and digital signature algorithms ML-KEM and ML-DSA respectively. Our proposed architecture effectively utilizes the arithmetic circuitry required for complex FFT, and the only additional circuits required are for modular reduction along with modifications in the control logic. Our implementation achieves performance comparable to state-of-the-art ML-KEM / ML-DSA NTT accelerators on FPGA, thus demonstrating how an FFT accelerator can be augmented to support NTT and the unified hardware can be used for both digital signal processing and post-quantum lattice-based cryptography applications.
Abstract:With the recent advancements in machine learning theory, many commercial embedded micro-processors use neural network models for a variety of signal processing applications. However, their associated side-channel security vulnerabilities pose a major concern. There have been several proof-of-concept attacks demonstrating the extraction of their model parameters and input data. But, many of these attacks involve specific assumptions, have limited applicability, or pose huge overheads to the attacker. In this work, we study the side-channel vulnerabilities of embedded neural network implementations by recovering their parameters using timing-based information leakage and simple power analysis side-channel attacks. We demonstrate our attacks on popular micro-controller platforms over networks of different precisions such as floating point, fixed point, binary networks. We are able to successfully recover not only the model parameters but also the inputs for the above networks. Countermeasures against timing-based attacks are implemented and their overheads are analyzed.