Large Language Models (LLMs) have revolutionized code generation by converting natural language descriptions into executable code. However, generating complex code in real-world scenarios remains challenging due to intricate structures, subtle bugs, the need to understand advanced data types, and a lack of supplementary content. To address these challenges, we introduce the CONLINE framework, which enhances code generation by incorporating planned online searches for information retrieval and automated correctness testing for iterative refinement. CONLINE also serializes complex inputs and outputs to improve comprehension and generates test cases to ensure the framework's adaptability to real-world applications. CONLINE is validated through rigorous experiments on the DS-1000 and ClassEval datasets. The results show that CONLINE substantially improves the quality of complex code generation, highlighting its potential to enhance the practicality and reliability of LLMs in generating intricate code.
Rogue emitter detection (RED) is a crucial technique for maintaining secure Internet of Things applications. Existing deep learning-based RED methods have been designed for favorable environments. However, their performance is unstable in low signal-to-noise ratio (SNR) scenarios. To address this problem, we propose a robust RED method based on a hybrid network that combines a denoising autoencoder with deep metric learning (DML). Specifically, the denoising autoencoder is adopted to mitigate noise interference and thereby improve robustness under low SNR, while DML improves feature discrimination. Experiments are conducted to evaluate the proposed RED method on an automatic dependent surveillance-broadcast dataset and an IEEE 802.11 dataset, and to compare it with existing RED methods. Simulation results show that the proposed method achieves better RED performance and higher noise robustness, with more discriminative semantic vectors, than existing methods.
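As a minimal sketch of what a low-SNR evaluation condition means, the following adds complex white Gaussian noise to a clean signal at a chosen SNR. A denoising autoencoder is typically trained on pairs like (`noisy`, `clean`). The helper names (`add_awgn`, `measured_snr_db`) and the test tone are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math
import random

def mean_power(samples):
    """Average power of a sequence of complex samples."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)

def add_awgn(signal, snr_db, rng):
    """Return the signal plus complex white Gaussian noise at the requested SNR (dB)."""
    noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    # Split the noise power equally between the real and imaginary parts.
    sigma = math.sqrt(noise_power / 2.0)
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for s in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR (dB) of a noisy signal relative to its clean version."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(noise))

# Hypothetical test signal: a unit-amplitude complex tone.
rng = random.Random(0)
clean = [cmath.exp(2j * math.pi * 0.05 * n) for n in range(20000)]
noisy = add_awgn(clean, snr_db=0.0, rng=rng)
print(round(measured_snr_db(clean, noisy), 2))
```

The measured SNR should land close to the requested value, with small deviations caused by the finite number of noise samples.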
Specific emitter identification (SEI) plays an increasingly crucial role in both military and civilian scenarios. It refers to the process of discriminating individual emitters from one another by analyzing characteristics extracted from given radio signals. Deep learning (DL) and deep neural networks (DNNs), which can learn the hidden features of data and build a classifier automatically for decision making, have been widely used in SEI research. Because labeled training samples are often insufficient while unlabeled training samples are plentiful, semi-supervised learning-based SEI (SS-SEI) methods have been proposed. However, few SS-SEI methods focus on extracting discriminative and generalized semantic features of radio signals. In this paper, we propose an SS-SEI method using metric-adversarial training (MAT). Specifically, pseudo labels are introduced into metric learning to enable semi-supervised metric learning (SSML), and an objective function alternately regularized by SSML and virtual adversarial training (VAT) is designed to extract discriminative and generalized semantic features of radio signals. The proposed MAT-based SS-SEI method is evaluated on an open-source, large-scale, real-world automatic dependent surveillance-broadcast (ADS-B) dataset and a WiFi dataset, and is compared with state-of-the-art methods. The simulation results show that the proposed method achieves better identification performance than existing state-of-the-art methods. Specifically, when the ratio of the number of labeled training samples to the number of all training samples is 10\%, the identification accuracy is 84.80\% on the ADS-B dataset and 80.70\% on the WiFi dataset. Our code can be downloaded from https://github.com/lovelymimola/MAT-based-SS-SEI.
Code contrastive pre-training has recently achieved significant progress on code-related tasks. In this paper, we present \textbf{SCodeR}, a \textbf{S}oft-labeled contrastive pre-training framework with two positive sample construction methods to learn functional-level \textbf{Code} \textbf{R}epresentation. Considering the relevance between codes in a large-scale code corpus, the soft-labeled contrastive pre-training can obtain fine-grained soft-labels in an iterative adversarial manner and use them to learn better code representation. Positive sample construction is another key to contrastive pre-training. Previous works use transformation-based methods such as variable renaming to generate semantically equivalent positive codes. However, the generated code usually has a highly similar surface form, which misleads the model into focusing on superficial code structure instead of code semantics. To encourage SCodeR to capture semantic information from the code, we utilize code comments and abstract syntax sub-trees of the code to build positive samples. We conduct experiments on four code-related tasks over seven datasets. Extensive experimental results show that SCodeR achieves new state-of-the-art performance on all of them, which illustrates the effectiveness of the proposed pre-training method.
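To illustrate the general idea of a soft-labeled contrastive objective, the sketch below computes the cross-entropy between a soft target distribution over candidates and the softmax of their similarity scores. With a one-hot target it reduces to the usual InfoNCE-style loss. This is a generic formulation under assumed names (`soft_contrastive_loss`, `temperature`); the abstract does not give SCodeR's exact loss, and the iterative adversarial procedure that produces the soft labels is not reproduced here.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_contrastive_loss(similarities, soft_labels, temperature=0.05):
    """Cross-entropy between soft labels and softmax(similarities / temperature)."""
    probs = softmax([s / temperature for s in similarities])
    # Terms with zero target weight contribute nothing to the sum.
    return -sum(t * math.log(p) for t, p in zip(soft_labels, probs) if t > 0)

# Hypothetical similarities between one query and four candidate code snippets.
sims = [0.9, 0.7, 0.1, 0.0]
hard = [1.0, 0.0, 0.0, 0.0]   # standard contrastive target
soft = [0.8, 0.2, 0.0, 0.0]   # second candidate treated as partially relevant
print(soft_contrastive_loss(sims, hard), soft_contrastive_loss(sims, soft))
```

A hard target pushes every other candidate away, while a soft target only partially penalizes a candidate that is judged to be related to the query.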
A time-frequency diagram is a commonly used visualization for observing the time-frequency distribution of radio signals and analyzing the time-varying patterns of their communication states in radio monitoring and management. While it excels at short-term signal analyses, it becomes unsuitable for long-term signal analyses because it cannot adequately depict time-varying signal patterns over a large time span on a space-limited screen. This research thus presents an abstract signal time-frequency (ASTF) diagram to address this problem. In the diagram design, a visual abstraction method is proposed to visually encode signal communication state changes in time slices. A time segmentation algorithm is proposed to divide a large time span into time slices. Three new quantified metrics and a loss function are defined to ensure the preservation of important time-varying information in the time segmentation. An algorithm performance experiment and a user study are conducted to evaluate the effectiveness of the diagram for long-term signal analyses.
The potential advantages of intelligent wireless communications with millimeter wave (mmWave) and massive multiple-input multiple-output (MIMO) all depend on the availability of instantaneous channel state information (CSI) at the base station (BS). However, in frequency division duplex (FDD) systems, the lack of channel reciprocity makes it difficult to acquire accurate CSI at the BS. In recent years, many researchers have explored effective deep learning (DL)-based architectures to solve this problem and have demonstrated the success of DL-based solutions. However, existing schemes focus on acquiring complete CSI while ignoring the beamforming and precoding operations. In this paper, we propose an intelligent channel feedback architecture designed for beamforming, based on an attention mechanism and eigen features. Specifically, we design an eigenmatrix and eigenvector feedback neural network, called EMEVNet. The key idea of EMEVNet is to extract and feed back the information that meets the requirements of the beamforming and precoding operations at the BS. With the help of the attention mechanism, EMEVNet can be regarded as a dual-channel auto-encoder that jointly encodes the eigenmatrix and eigenvector into codewords. EMEVNet consists of an encoder deployed at the user and a decoder deployed at the BS. Each user first applies singular value decomposition (SVD) to extract the eigen features from the CSI, and then selects an appropriate encoder for the specific channel to generate feedback codewords.
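The SVD step mentioned above can be sketched with NumPy as follows. The example extracts the dominant right singular vector of a channel matrix, which is the unit-norm transmit beamforming vector that maximizes the received signal norm. The function name `dominant_eigenvector`, the random channel, and its dimensions are assumptions for illustration; the abstract does not specify how EMEVNet constructs its eigenmatrix, so that part is not modeled.

```python
import numpy as np

def dominant_eigenvector(H):
    """Return the dominant right singular vector of H and its singular value.

    H has shape (n_rx, n_tx). The returned vector is also the eigenvector of
    H^H H associated with its largest eigenvalue.
    """
    # numpy returns Vh = V^H, so the first right singular vector is conj(Vh[0]).
    _, s, Vh = np.linalg.svd(H, full_matrices=False)
    return Vh[0].conj(), s[0]

# Hypothetical 4x8 complex channel with i.i.d. Gaussian entries.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))
v, s0 = dominant_eigenvector(H)

# Beamforming along v yields a received norm equal to the largest singular value.
print(np.linalg.norm(H @ v), s0)
```

Feeding back `v` (a vector of length `n_tx`) rather than the full matrix `H` is one reason eigen-feature feedback can be more compact than complete CSI feedback.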
Automatic modulation classification (AMC) is a key technique for designing non-cooperative communication systems, and deep learning (DL) has been applied effectively to AMC to improve classification accuracy. However, most DL-based AMC methods have a large number of parameters and high computational complexity, so they cannot be directly applied in scenarios with limited computing power and storage space. In this paper, we propose a lightweight and low-complexity AMC method using an ultra lite convolutional neural network (ULCNN), which combines multiple techniques, including data augmentation, complex-valued convolution, separable convolution, channel attention, and channel shuffle. Simulation results demonstrate that the proposed ULCNN-based AMC method achieves an average accuracy of 62.47\% on RML2016.10a with only 9,751 parameters. Moreover, ULCNN is verified on a typical edge device (Raspberry Pi), where the inference time per sample is about 0.775 ms. The reproducible code can be downloaded from GitHub at https://github.com/BeechburgPieStar/Ultra-Lite-Convolutional-Neural-Network-for-Automatic-Modulation-Classification.
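To show why separable convolution helps keep the parameter count small, the following sketch compares the number of parameters in a standard 1-D convolution with that of a depthwise separable one (a depthwise convolution followed by a 1x1 pointwise convolution). The layer sizes are hypothetical and are not taken from ULCNN.

```python
def conv1d_params(kernel_size, in_channels, out_channels, bias=True):
    """Weights (plus optional biases) of a standard 1-D convolution."""
    return kernel_size * in_channels * out_channels + (out_channels if bias else 0)

def separable_conv1d_params(kernel_size, in_channels, out_channels, bias=True):
    """Parameters of a depthwise convolution followed by a pointwise (1x1) convolution."""
    # Depthwise: one kernel of length kernel_size per input channel.
    depthwise = kernel_size * in_channels + (in_channels if bias else 0)
    # Pointwise: mixes channels with a kernel of length 1.
    pointwise = in_channels * out_channels + (out_channels if bias else 0)
    return depthwise + pointwise

# Hypothetical layer: kernel size 5, 32 input channels, 32 output channels.
standard = conv1d_params(5, 32, 32)
separable = separable_conv1d_params(5, 32, 32)
print(standard, separable, round(standard / separable, 2))
```

The saving grows with the kernel size and the number of output channels, since the channel-mixing cost no longer scales with the kernel length.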
Specific emitter identification (SEI) is a highly promising technology for physical layer authentication, which is one of the most critical supplements to upper-layer authentication. SEI is based on radio frequency (RF) features that arise from circuit differences, rather than on cryptography. These features are inherent characteristics of hardware circuits and are difficult to counterfeit. Recently, various deep learning (DL)-based conventional SEI methods have been proposed and have achieved advanced performance. However, these methods are designed for closed-set scenarios with massive RF signal samples for training, and they generally perform poorly when training samples are limited. Thus, we focus on few-shot SEI (FS-SEI) for aircraft identification via automatic dependent surveillance-broadcast (ADS-B) signals and propose a novel FS-SEI method based on deep metric ensemble learning (DMEL). Specifically, the proposed method consists of feature embedding and classification. The former is based on metric learning with a complex-valued convolutional neural network (CVCNN) for extracting discriminative features with compact intra-category distance and separable inter-category distance, while the latter is realized by an ensemble classifier. Simulation results show that when the number of samples per category is more than 5, the average accuracy of the proposed method is higher than 98\%. Moreover, feature visualization demonstrates the advantages of the proposed method in both discriminability and generalization. The code for this paper can be downloaded from GitHub (https://github.com/BeechburgPieStar/Few-Shot-Specific-Emitter-Identification-via-Deep-Metric-Ensemble-Learning).
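The goal of compact intra-category distance and separable inter-category distance can be illustrated with a triplet loss, one common metric-learning objective. The abstract does not state which loss DMEL uses, so the choice of triplet loss, the function names, and the toy 2-D embeddings are all assumptions for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the negative is farther than the positive by the margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings: anchor and positive share an emitter, negative does not.
anchor, positive = (0.0, 0.0), (0.0, 1.0)
far_negative = (3.0, 4.0)    # well separated, so no penalty
near_negative = (1.0, 0.0)   # as close as the positive, so it is penalized
print(triplet_loss(anchor, positive, far_negative),
      triplet_loss(anchor, positive, near_negative))
```

Minimizing such a loss over many triplets pulls embeddings of the same emitter together and pushes embeddings of different emitters apart, which is the property a downstream classifier relies on when only a few samples per category are available.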
Neural networks have been widely applied in security applications such as spam and phishing detection, intrusion prevention, and malware detection. This black-box approach, however, often suffers from uncertainty and poor explainability in applications. Furthermore, neural networks themselves are often vulnerable to adversarial attacks. For these reasons, there is high demand for trustworthy and rigorous methods to verify the robustness of neural network models. Adversarial robustness, which concerns the reliability of a neural network when dealing with maliciously manipulated inputs, is one of the most active topics in security and machine learning. In this work, we survey existing literature on adversarial robustness verification for neural networks and collect 39 diverse research works across the machine learning, security, and software engineering domains. We systematically analyze their approaches, including how robustness is formulated, what verification techniques are used, and the strengths and limitations of each technique. We provide a taxonomy from a formal verification perspective for a comprehensive understanding of this topic. We classify the existing techniques based on property specification, problem reduction, and reasoning strategies. We also demonstrate representative techniques that have been applied in existing studies with a sample model. Finally, we discuss open questions for future research.
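As one concrete example of a robustness verification technique, the sketch below applies interval bound propagation to a tiny ReLU network. It checks whether every input within an L-infinity ball of radius `eps` around `x` is guaranteed to receive the same class. The abstract does not say which techniques the survey demonstrates, so this choice, the function names, and the toy network are assumptions. The method is sound but incomplete: a `False` result means "not proven", not "an attack exists".

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate the box [lower, upper] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps lower to lower; a negative weight swaps the ends.
            lo += w * l if w >= 0 else w * u
            hi += w * u if w >= 0 else w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi

def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]

def certify(x, eps, layers, true_class):
    """Return True if the true class provably wins for all inputs within eps of x."""
    lower = [v - eps for v in x]
    upper = [v + eps for v in x]
    for i, (W, b) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, W, b)
        if i < len(layers) - 1:  # no activation after the output layer
            lower, upper = relu_bounds(lower, upper)
    return all(lower[true_class] > upper[j]
               for j in range(len(lower)) if j != true_class)

# Hypothetical two-layer network with two inputs and two output classes.
layers = [([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
          ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])]
print(certify([1.0, 0.0], 0.1, layers, true_class=0))
print(certify([1.0, 0.0], 0.6, layers, true_class=0))
```

For this toy network the small perturbation radius can be certified while the larger one cannot, which illustrates how the over-approximation loosens as the input region grows.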
Radio frequency fingerprint (RFF) identification based on deep learning has the potential to enhance the security of wireless networks. Recently, several RFF datasets have been proposed to meet the need for large-scale data. However, most of these datasets are collected from 2.4 GHz WiFi devices under similar channel environments. Moreover, they provide only received data collected by specific equipment. This paper uses a software radio peripheral as the dataset generation platform, so users can customize dataset parameters such as frequency band, modulation mode, and antenna gain. In addition, the proposed dataset is generated under various complex channel environments in order to better characterize real-world radio frequency signals. We collect data at both transmitters and receivers to simulate a real-world RFF dataset based on long-term evolution (LTE). Furthermore, we verify the dataset and confirm its reliability. The dataset and reproducible code of this paper can be downloaded from GitHub: https://github.com/njuptzsp/XSRPdataset.