Existing datasets used to train deep learning models for narrowband radio frequency (RF) signal classification lack enough diversity in signal types and channel impairments to sufficiently assess model performance in the real world. We introduce the Sig53 dataset, consisting of 5 million synthetically generated samples from 53 different signal classes, with expertly chosen impairments. We also introduce TorchSig, a signals processing machine learning toolkit that can be used to generate this dataset. TorchSig incorporates data-handling principles common to the vision domain, and it is meant to serve as an open-source foundation for future signals machine learning research. Initial experiments on the Sig53 dataset are conducted with state-of-the-art (SoTA) convolutional neural networks (ConvNets) and Transformers. These experiments reveal that Transformers outperform ConvNets without the need for additional regularization or a ConvNet teacher, contrary to results from the vision domain. Additional experiments demonstrate that TorchSig's domain-specific data augmentations facilitate model training, which ultimately benefits model performance. Finally, TorchSig supports on-the-fly synthetic data creation at training time, enabling massive-scale training sessions with virtually unlimited datasets.
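As a concrete illustration of the intended workflow, the following is a minimal sketch of loading Sig53 through TorchSig with domain-specific augmentations feeding a standard PyTorch DataLoader. The module paths, class names, and constructor arguments (e.g., `torchsig.datasets.sig53.Sig53`, `impaired`) are assumptions modeled on TorchSig's published examples and may differ across library versions.

```python
# Minimal sketch: Sig53 with TorchSig-style augmentations and a standard
# PyTorch DataLoader. Module paths and argument names are assumptions and
# may differ across TorchSig versions.
from torch.utils.data import DataLoader
from torchsig.datasets.sig53 import Sig53        # assumed module path
from torchsig.transforms import Compose, Normalize, RandomTimeShift

transform = Compose([
    RandomTimeShift(shift=(-10, 10)),  # domain-specific augmentation
    Normalize(norm=2),                 # unit-power normalization
])

dataset = Sig53(
    root="./data",     # location of the generated dataset
    train=True,
    impaired=True,     # include the expertly chosen channel impairments
    transform=transform,
)

loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4)
iq, label = next(iter(loader))         # iq: batch of I/Q sample tensors
```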
We present a new RF fingerprinting technique for wireless emitters based on a simple, easily and efficiently retrainable Ridge Regression (RR) classifier. The RR learns to identify devices from bursts of waveform samples, conveniently transformed and preprocessed by delay-loop reservoirs. Deep delay Loop Reservoir Computing (DLR) is our processing architecture that supports general machine learning algorithms on resource-constrained devices by leveraging delay-loop reservoir computing (RC) and innovative architectures of loop trees. In prior work, we trained and evaluated DLR using high-SNR device emissions in clean channels. Here we demonstrate how to use DLR for IoT authentication by performing RF-based Specific Emitter Identification (SEI), even in the presence of fading channels and heavy in-band jamming, by leveraging a matched filter (MF) extension dubbed MF-DLR. We show that MF processing improves the SEI performance of RR without the RC transformation (MF-RR), but that MF-DLR is more robust and applicable to signatures beyond waveform transients (e.g., turn-on transients).
* arXiv admin note: text overlap with arXiv:2104.00751
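To make the MF-DLR pipeline above concrete, here is a hedged software sketch: a matched filter correlates each burst against a known pulse template, a simulated single delay loop with time-multiplexed virtual nodes performs the reservoir transformation, and an RR classifier acts as the readout. The loop parameters, the mask, and the burst-level averaging are illustrative choices, not the paper's exact configuration.

```python
# Hedged sketch of MF-DLR: matched filter -> simulated delay-loop
# reservoir -> Ridge Regression readout. Parameters are illustrative.
import numpy as np
from sklearn.linear_model import RidgeClassifier

def matched_filter(x, template):
    """Correlate the received burst against a known pulse template."""
    return np.convolve(x, np.conj(template[::-1]), mode="same")

def delay_loop_reservoir(u, n_nodes=100, eta=0.5, gamma=0.05, seed=0):
    """Single delay loop with n_nodes time-multiplexed virtual nodes."""
    rng = np.random.default_rng(seed)
    mask = rng.choice([-1.0, 1.0], size=n_nodes)  # random input mask
    state = np.zeros(n_nodes)
    states = []
    for sample in u:
        for i in range(n_nodes):                  # one pass around the loop
            feedback = state[i - 1]               # delayed neighbor (wraps)
            state[i] = np.tanh(eta * feedback + gamma * mask[i] * sample)
        states.append(state.copy())
    return np.asarray(states).mean(axis=0)        # burst-level feature vector

def featurize(burst, template):
    """MF preprocessing followed by the reservoir transformation."""
    return delay_loop_reservoir(np.abs(matched_filter(burst, template)))

# X = np.stack([featurize(b, template) for b in bursts])
# clf = RidgeClassifier(alpha=1.0).fit(X, device_labels)  # fast retraining
```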
We introduce a novel design for in-situ training of machine learning algorithms built into smart sensors, and illustrate distributed training scenarios using radio frequency (RF) spectrum sensors. Current RF sensors at the Edge lack the computational resources to support practical, in-situ training for intelligent signal classification. We propose a solution using Deep delay Loop Reservoir Computing (DLR), a processing architecture that supports machine learning algorithms on resource-constrained edge devices by leveraging delay-loop reservoir computing in combination with innovative hardware. DLR delivers reductions in form factor, hardware complexity, and latency compared to State-of-the-Art (SoA) neural nets. We demonstrate DLR for two applications: RF Specific Emitter Identification (SEI) and wireless protocol recognition. DLR enables mobile edge platforms to authenticate and then track emitters with fast SEI retraining. Once delay loops separate the data classes, traditionally complex, power-hungry classification models are no longer needed for the learning process. Yet even with simple classifiers such as Ridge Regression (RR), the complexity grows at least quadratically with the input size. DLR with an RR classifier exceeds the SoA accuracy while further reducing power consumption by leveraging the architecture of parallel (split) loops. To authenticate mobile devices across large regions, DLR can be trained in a distributed fashion with very little additional processing and a small communication cost, all while maintaining accuracy. We illustrate how to merge locally trained DLR classifiers in use cases of interest.
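The distributed-training claim can be grounded with a standard trick for merging ridge-regression readouts: each site ships only its Gram matrix and cross-covariance, whose sums yield exactly the solution one would get from training on the pooled data. Whether the paper uses this particular merge rule is an assumption; the sketch below illustrates the small, accuracy-preserving communication cost (O(d^2) values per site for features of dimension d, consistent with RR complexity growing quadratically in the input size).

```python
# Sketch: merging locally trained ridge regressions by summing sufficient
# statistics. The merged solution equals central training on pooled data;
# whether the paper uses this exact merge rule is an assumption.
import numpy as np

def local_stats(X, Y):
    """Each edge node computes and ships only these two matrices."""
    return X.T @ X, X.T @ Y

def merge_ridge(stats, lam=1.0):
    """Fuse per-site statistics into one global ridge solution."""
    G = sum(s[0] for s in stats)           # sum of Gram matrices
    C = sum(s[1] for s in stats)           # sum of cross-covariances
    d = G.shape[0]
    return np.linalg.solve(G + lam * np.eye(d), C)

# Example: two sites give the same result as training on pooled data.
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(50, 8)), rng.normal(size=(70, 8))
Y1, Y2 = rng.normal(size=(50, 3)), rng.normal(size=(70, 3))
W = merge_ridge([local_stats(X1, Y1), local_stats(X2, Y2)])
W_central = merge_ridge([local_stats(np.vstack([X1, X2]),
                                     np.vstack([Y1, Y2]))])
assert np.allclose(W, W_central)
```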
Current AI systems at the tactical edge lack the computational resources to support in-situ training and inference for situational awareness, and it is not always practical to leverage backhaul resources due to security, bandwidth, and mission latency requirements. We propose a solution through Deep delay Loop Reservoir Computing (DLR), a processing architecture supporting general machine learning algorithms on compact mobile devices by leveraging delay-loop (DL) reservoir computing in combination with innovative photonic hardware that exploits the inherent speed and the spatial, temporal, and wavelength-based processing diversity of signals in the optical domain. DLR delivers reductions in form factor, hardware complexity, power consumption, and latency compared to the State-of-the-Art. DLR can be implemented with a single photonic DL and a few electro-optical components. In certain cases, multiple DL layers increase the learning capacity of the DLR with no added latency. We demonstrate the advantages of DLR on the application of RF Specific Emitter Identification.
* 6 pages, appeared in GOMACTech 2020, released for public
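For reference, the single delay loop in the abstract above is commonly modeled by Ikeda-type delay dynamics with time-multiplexed virtual nodes; the sketch below is the standard textbook form, and the actual photonic nonlinearity and gains used in DLR are assumptions here.

```latex
% Standard single-delay-loop reservoir dynamics (Ikeda type); the photonic
% loop's actual nonlinearity f and gains (eta, gamma) are assumptions.
\begin{align}
  \dot{x}(t) &= -x(t) + f\bigl(\eta\, x(t-\tau) + \gamma\, J(t)\bigr),\\
  J(t) &= m(t)\, u\!\left(\lfloor t/\tau \rfloor\right),
  \qquad m(t+\tau) = m(t),
\end{align}
% with N virtual nodes read out at spacing theta = tau/N:
%   x_i(k) = x(k*tau + i*theta),  i = 1, ..., N.
```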
We show that compact fully connected (FC) deep learning networks trained to classify wireless protocols using a hierarchy of multiple denoising autoencoders (AEs) outperform reference FC networks trained in a typical way, i.e., with a stochastic gradient based optimization of a given FC architecture. Not only is the complexity of such an FC network, measured in the number of trainable parameters and scalar multiplications, much lower than that of the reference FC and residual models; its accuracy also exceeds that of both models for nearly all tested SNR values (0 dB to 50 dB). Such AE-trained networks are suited for in-situ protocol inference performed by simple mobile devices based on noisy signal measurements. Training is based on data transmitted by real devices and collected in a controlled environment, then systematically augmented by a policy-based data synthesis process that adds to the signal any subset of impairments commonly seen in a wireless receiver.
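A hedged sketch of the training scheme described above: greedy layer-wise pretraining with denoising autoencoders (DAEs), after which the stacked encoders are fine-tuned as a compact FC classifier. PyTorch is used for illustration; the layer sizes, noise level, and ten-class head are illustrative choices, not the paper's configuration.

```python
# Hedged sketch: greedy layer-wise DAE pretraining, then stacking the
# encoders into a compact FC classifier. Sizes/noise are illustrative.
import torch
import torch.nn as nn

def pretrain_dae(encoder, data, noise_std=0.1, epochs=10, lr=1e-3):
    """Train encoder+decoder to reconstruct clean inputs from noisy ones."""
    decoder = nn.Linear(encoder.out_features, encoder.in_features)
    opt = torch.optim.Adam(list(encoder.parameters()) +
                           list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        noisy = data + noise_std * torch.randn_like(data)
        recon = decoder(torch.relu(encoder(noisy)))
        loss = nn.functional.mse_loss(recon, data)
        opt.zero_grad(); loss.backward(); opt.step()
    return encoder

sizes = [1024, 256, 64]                  # compact FC hierarchy (assumed)
x = torch.randn(512, sizes[0])           # stand-in for measured signals
encoders = []
for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc = pretrain_dae(nn.Linear(d_in, d_out), x)
    encoders.append(enc)
    with torch.no_grad():
        x = torch.relu(enc(x))           # hidden codes feed the next DAE

# Stack the pretrained encoders, add a protocol-classification head, and
# fine-tune end-to-end with labels (omitted). 10 classes is an assumption.
model = nn.Sequential(*[nn.Sequential(e, nn.ReLU()) for e in encoders],
                      nn.Linear(sizes[-1], 10))
```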
Adversarial examples in machine learning for images are widely publicized and explored. Illustrations of misclassifications caused by slightly perturbed inputs are abundant and commonly known (e.g., a picture of a panda imperceptibly perturbed to fool the classifier into incorrectly labeling it as a gibbon). Similar attacks on deep learning (DL) for radio frequency (RF) signals, and their mitigation strategies, are scarcely addressed in the published work. Yet RF adversarial examples (AdExs) with minimal waveform perturbations can cause drastic, targeted misclassification results, particularly against spectrum sensing/survey applications (e.g., BPSK mistaken for 8-PSK). Our research on deep learning AdExs and the proposed defense mechanisms is RF-centric and incorporates physical-world, over-the-air (OTA) effects. We herein present defense mechanisms based on pre-training the target classifier using an autoencoder. Our results validate this approach as a viable mitigation method to counter adversarial attacks against deep learning-based communications and radar sensing systems.
* arXiv admin note: substantial text overlap with arXiv:1902.06044
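For context on the attack side of the abstract above, the following is a generic sketch of crafting an RF adversarial example with the fast gradient sign method (FGSM), one common white-box attack; the abstract does not specify the attack used, so this is purely illustrative.

```python
# Generic FGSM sketch: a minimal waveform perturbation that increases the
# classifier's loss. The paper's actual attack model is not specified.
import torch
import torch.nn as nn

def fgsm(model, iq, label, eps=0.01):
    """Perturb I/Q samples in the direction that increases the loss."""
    iq = iq.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(iq), label)
    loss.backward()
    return (iq + eps * iq.grad.sign()).detach()   # barely altered waveform

# model: any modulation classifier over (batch, 2, n_samples) I/Q tensors
# adv = fgsm(model, iq_batch, true_labels)        # e.g., BPSK -> "8-PSK"
```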
While research on adversarial examples in machine learning for images has been prolific, similar attacks on deep learning (DL) for radio frequency (RF) signals, and their mitigation strategies, are scarcely addressed in the published work, with only one recent publication in the RF domain. RF adversarial examples (AdExs) can cause drastic, targeted misclassification results, mostly in spectrum sensing/survey applications (e.g., BPSK mistaken for 8-PSK), with minimal waveform perturbation. It is not clear whether RF AdExs maintain their effects in the physical world, i.e., when AdExs are delivered over-the-air (OTA). Our research on deep learning AdExs and the proposed defense mechanisms is RF-centric and incorporates physical-world, OTA effects. Here we present defense mechanisms based on statistical tests. One test to detect AdExs utilizes the Peak-to-Average-Power-Ratio (PAPR) of the DL data points delivered OTA, while another statistical test uses the Softmax outputs of the DL classifier, which correspond to the probabilities the classifier assigns to each of the trained classes. The former test leverages the RF nature of the data, while the latter is universally applicable to AdExs regardless of their origin. Both solutions are shown to be viable mitigation methods to counter adversarial attacks against communications and radar sensing systems.
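The PAPR-based test lends itself to a compact sketch: compute the PAPR of each incoming over-the-air data point and compare its distribution against that observed on clean data. The two-sample Kolmogorov-Smirnov test below is an assumed choice of test statistic; the paper's exact decision rule may differ.

```python
# Hedged sketch of the PAPR-based AdEx detector. The KS test is an assumed
# choice of statistic; the paper's exact decision rule may differ.
import numpy as np
from scipy.stats import ks_2samp

def papr(x):
    """Peak-to-average power ratio of one complex baseband burst."""
    p = np.abs(x) ** 2
    return p.max() / p.mean()

def adex_suspected(clean_bursts, incoming_bursts, alpha=0.01):
    clean = [papr(b) for b in clean_bursts]
    incoming = [papr(b) for b in incoming_bursts]
    return ks_2samp(clean, incoming).pvalue < alpha  # reject -> flag AdExs
```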
We propose a method for estimating channel parameters from RSSI measurements and the lost-packet count, which works in the presence of losses due to both interference and signal attenuation below the noise floor. This is especially important in wireless networks, such as vehicular networks, where the propagation model changes with the density of nodes. The method is based on Stochastic Expectation Maximization (SEM), where the received data are modeled as a mixture of distributions (no/low interference and strong interference) that is incomplete (censored) due to packet losses. The PDFs in the mixture are Gamma, in accordance with the commonly accepted model for wireless signal and interference power. This approach leverages the loss count as additional information, hence outperforming maximum likelihood estimation, which does not use this information (denoted ML-), for a small number of received RSSI samples. It thus allows inexpensive on-line channel estimation from ad-hoc collected data. The method also outperforms ML- on uncensored data mixtures, since ML- assumes that samples come from a single-mode PDF.
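A hedged sketch of one Stochastic EM pass under this model: the lost packets are imputed as draws from the current mixture truncated below the noise floor (their number given by the lost-packet count), component labels are sampled from posteriors, and each Gamma component is refit. The two-component structure and fitting details are illustrative, not the paper's exact algorithm.

```python
# Hedged sketch of one Stochastic EM (SEM) pass for a two-component Gamma
# mixture of received powers, censored below the noise floor. Details are
# illustrative; it assumes each component keeps at least a few samples.
import numpy as np
from scipy import stats

def sem_step(rssi, n_lost, floor, params, rng):
    """params: list of (weight, shape, scale) for the two Gamma components."""
    # S-step 1: impute the lost packets as draws from the current mixture
    # truncated below the noise floor (simple rejection sampling).
    imputed = []
    while len(imputed) < n_lost:
        k = rng.choice(2, p=[params[0][0], params[1][0]])
        x = stats.gamma.rvs(params[k][1], scale=params[k][2],
                            random_state=rng)
        if x < floor:
            imputed.append(x)
    data = np.concatenate([rssi, imputed])

    # S-step 2: sample a component label for every point from its posterior.
    like = np.stack([w * stats.gamma.pdf(data, a, scale=s)
                     for (w, a, s) in params])
    post = like / like.sum(axis=0)
    z = np.array([rng.choice(2, p=post[:, i]) for i in range(len(data))])

    # M-step: refit each Gamma component on its assigned samples.
    new_params = []
    for k in range(2):
        xs = data[z == k]
        a, _, s = stats.gamma.fit(xs, floc=0)   # shape/scale MLE, loc fixed
        new_params.append((len(xs) / len(data), a, s))
    return new_params

# rng = np.random.default_rng(0)
# params = [(0.5, 2.0, 1.0), (0.5, 5.0, 3.0)]   # initial guess
# iterate: params = sem_step(rssi, n_lost, floor, params, rng)
```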