Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Qi Xuan

MDM: Advancing Multi-Domain Distribution Matching for Automatic Modulation Recognition Dataset Synthesis

Aug 05, 2024

Dongwei Xu, Jiajun Chen, Yao Lu, Tianhao Xia, Qi Xuan, Wei Wang, Yun Lin, Xiaoniu Yang

Abstract:Recently, deep learning technology has been successfully introduced into Automatic Modulation Recognition (AMR) tasks. However, the success of deep learning is all attributed to the training on large-scale datasets. Such a large amount of data brings huge pressure on storage, transmission and model training. In order to solve the problem of large amount of data, some researchers put forward the method of data distillation, which aims to compress large training data into smaller synthetic datasets to maintain its performance. While numerous data distillation techniques have been developed within the realm of image processing, the unique characteristics of signals set them apart. Signals exhibit distinct features across various domains, necessitating specialized approaches for their analysis and processing. To this end, a novel dataset distillation method--Multi-domain Distribution Matching (MDM) is proposed. MDM employs the Discrete Fourier Transform (DFT) to translate timedomain signals into the frequency domain, and then uses a model to compute distribution matching losses between the synthetic and real datasets, considering both the time and frequency domains. Ultimately, these two losses are integrated to update the synthetic dataset. We conduct extensive experiments on three AMR datasets. Experimental results show that, compared with baseline methods, our method achieves better performance under the same compression ratio. Furthermore, we conduct crossarchitecture generalization experiments on several models, and the experimental results show that our synthetic datasets can generalize well on other unseen models.

Via

Access Paper or Ask Questions

Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Aug 01, 2024

Chenxiang Jin, Jiajun Zhou, Chenxuan Xie, Shanqing Yu, Qi Xuan, Xiaoniu Yang

Figure 1 for Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Figure 2 for Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Figure 3 for Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Figure 4 for Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Abstract:The rampant fraudulent activities on Ethereum hinder the healthy development of the blockchain ecosystem, necessitating the reinforcement of regulations. However, multiple imbalances involving account interaction frequencies and interaction types in the Ethereum transaction environment pose significant challenges to data mining-based fraud detection research. To address this, we first propose the concept of meta-interactions to refine interaction behaviors in Ethereum, and based on this, we present a dual self-supervision enhanced Ethereum fraud detection framework, named Meta-IFD. This framework initially introduces a generative self-supervision mechanism to augment the interaction features of accounts, followed by a contrastive self-supervision mechanism to differentiate various behavior patterns, and ultimately characterizes the behavioral representations of accounts and mines potential fraud risks through multi-view interaction feature learning. Extensive experiments on real Ethereum datasets demonstrate the effectiveness and superiority of our framework in detecting common Ethereum fraud behaviors such as Ponzi schemes and phishing scams. Additionally, the generative module can effectively alleviate the interaction distribution imbalance in Ethereum data, while the contrastive module significantly enhances the framework's ability to distinguish different behavior patterns. The source code will be released on GitHub soon.

Via

Access Paper or Ask Questions

Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

Jun 29, 2024

Jiacheng Yao, Maolin Wang, Wanqi Chen, Chengxiang Jin, Jiajun Zhou, Shanqing Yu, Qi Xuan

Figure 1 for Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

Figure 2 for Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

Figure 3 for Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

Figure 4 for Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

Abstract:The wide application of Ethereum technology has brought technological innovation to traditional industries. As one of Ethereum's core applications, smart contracts utilize diverse contract codes to meet various functional needs and have gained widespread use. However, the non-tamperability of smart contracts, coupled with vulnerabilities caused by natural flaws or human errors, has brought unprecedented challenges to blockchain security. Therefore, in order to ensure the healthy development of blockchain technology and the stability of the blockchain community, it is particularly important to study the vulnerability detection techniques for smart contracts. In this paper, we propose a Dual-view Aware Smart Contract Vulnerability Detection Framework named DVDet. The framework initially converts the source code and bytecode of smart contracts into weighted graphs and control flow sequences, capturing potential risk features from these two perspectives and integrating them for analysis, ultimately achieving effective contract vulnerability detection. Comprehensive experiments on the Ethereum dataset show that our method outperforms others in detecting vulnerabilities.

* Accepted by International Conference on Blockchain and Trustworthy Systems 2024

Via

Access Paper or Ask Questions

A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models

Jun 12, 2024

Yao Lu, Yutao Zhu, Yuqi Li, Dongwei Xu, Yun Lin, Qi Xuan, Xiaoniu Yang

Figure 1 for A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models

Figure 2 for A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models

Figure 3 for A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models

Figure 4 for A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models

Abstract:With the successful application of deep learning in communications systems, deep neural networks are becoming the preferred method for signal classification. Although these models yield impressive results, they often come with high computational complexity and large model sizes, which hinders their practical deployment in communication systems. To address this challenge, we propose a novel layer pruning method. Specifically, we decompose the model into several consecutive blocks, each containing consecutive layers with similar semantics. Then, we identify layers that need to be preserved within each block based on their contribution. Finally, we reassemble the pruned blocks and fine-tune the compact model. Extensive experiments on five datasets demonstrate the efficiency and effectiveness of our method over a variety of state-of-the-art baselines, including layer pruning and channel pruning methods.

Via

Access Paper or Ask Questions

Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

May 24, 2024

Zeyu Wang, Tianyi Jiang, Yao Lu, Xiaoze Bao, Shanqing Yu, Bin Wei, Qi Xuan

Figure 1 for Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

Figure 2 for Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

Figure 3 for Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

Figure 4 for Knowledge-enhanced Relation Graph and Task Sampling for Few-shot Molecular Property Prediction

Abstract:Recently, few-shot molecular property prediction (FSMPP) has garnered increasing attention. Despite impressive breakthroughs achieved by existing methods, they often overlook the inherent many-to-many relationships between molecules and properties, which limits their performance. For instance, similar substructures of molecules can inspire the exploration of new compounds. Additionally, the relationships between properties can be quantified, with high-related properties providing more information in exploring the target property than those low-related. To this end, this paper proposes a novel meta-learning FSMPP framework (KRGTS), which comprises the Knowledge-enhanced Relation Graph module and the Task Sampling module. The knowledge-enhanced relation graph module constructs the molecule-property multi-relation graph (MPMRG) to capture the many-to-many relationships between molecules and properties. The task sampling module includes a meta-training task sampler and an auxiliary task sampler, responsible for scheduling the meta-training process and sampling high-related auxiliary tasks, respectively, thereby achieving efficient meta-knowledge learning and reducing noise introduction. Empirically, extensive experiments on five datasets demonstrate the superiority of KRGTS over a variety of state-of-the-art methods. The code is available in https://github.com/Vencent-Won/KRGTS-public.

Via

Access Paper or Ask Questions

Exploring the Impact of Dataset Bias on Dataset Distillation

Mar 24, 2024

Yao Lu, Jianyang Gu, Xuguang Chen, Saeed Vahidian, Qi Xuan

Figure 1 for Exploring the Impact of Dataset Bias on Dataset Distillation

Figure 2 for Exploring the Impact of Dataset Bias on Dataset Distillation

Figure 3 for Exploring the Impact of Dataset Bias on Dataset Distillation

Figure 4 for Exploring the Impact of Dataset Bias on Dataset Distillation

Abstract:Dataset Distillation (DD) is a promising technique to synthesize a smaller dataset that preserves essential information from the original dataset. This synthetic dataset can serve as a substitute for the original large-scale one, and help alleviate the training workload. However, current DD methods typically operate under the assumption that the dataset is unbiased, overlooking potential bias issues within the dataset itself. To fill in this blank, we systematically investigate the influence of dataset bias on DD. To the best of our knowledge, this is the first exploration in the DD domain. Given that there are no suitable biased datasets for DD, we first construct two biased datasets, CMNIST-DD and CCIFAR10-DD, to establish a foundation for subsequent analysis. Then we utilize existing DD methods to generate synthetic datasets on CMNIST-DD and CCIFAR10-DD, and evaluate their performance following the standard process. Experiments demonstrate that biases present in the original dataset significantly impact the performance of the synthetic dataset in most cases, which highlights the necessity of identifying and mitigating biases in the original datasets during DD. Finally, we reformulate DD within the context of a biased dataset. Our code along with biased datasets are available at https://github.com/yaolu-zjut/Biased-DD.

Via

Access Paper or Ask Questions

A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

Mar 24, 2024

Hao Song, Jiacheng Yao, Zhengxi Li, Shaocong Xu, Shibo Jin, Jiajun Zhou, Chenbo Fu, Qi Xuan, Shanqing Yu

Figure 1 for A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

Figure 2 for A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

Figure 3 for A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

Figure 4 for A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

Abstract:Over the past few years, federated learning has become widely used in various classical machine learning fields because of its collaborative ability to train data from multiple sources without compromising privacy. However, in the area of graph neural networks, the nodes and network structures of graphs held by clients are different in many practical applications, and the aggregation method that directly shares model gradients cannot be directly applied to this scenario. Therefore, this work proposes a federated aggregation method FLGNN applied to various graph federation scenarios and investigates the aggregation effect of parameter sharing at each layer of the graph neural network model. The effectiveness of the federated aggregation method FLGNN is verified by experiments on real datasets. Additionally, for the privacy security of FLGNN, this paper designs membership inference attack experiments and differential privacy defense experiments. The results show that FLGNN performs good robustness, and the success rate of privacy theft is further reduced by adding differential privacy defense methods.

Via

Access Paper or Ask Questions

Multi-Modal Representation Learning for Molecular Property Prediction: Sequence, Graph, Geometry

Jan 09, 2024

Zeyu Wang, Tianyi Jiang, Jinhuan Wang, Qi Xuan

Abstract:Molecular property prediction refers to the task of labeling molecules with some biochemical properties, playing a pivotal role in the drug discovery and design process. Recently, with the advancement of machine learning, deep learning-based molecular property prediction has emerged as a solution to the resource-intensive nature of traditional methods, garnering significant attention. Among them, molecular representation learning is the key factor for molecular property prediction performance. And there are lots of sequence-based, graph-based, and geometry-based methods that have been proposed. However, the majority of existing studies focus solely on one modality for learning molecular representations, failing to comprehensively capture molecular characteristics and information. In this paper, a novel multi-modal representation learning model, which integrates the sequence, graph, and geometry characteristics, is proposed for molecular property prediction, called SGGRL. Specifically, we design a fusion layer to fusion the representation of different modalities. Furthermore, to ensure consistency across modalities, SGGRL is trained to maximize the similarity of representations for the same molecule while minimizing similarity for different molecules. To verify the effectiveness of SGGRL, seven molecular datasets, and several baselines are used for evaluation and comparison. The experimental results demonstrate that SGGRL consistently outperforms the baselines in most cases. This further underscores the capability of SGGRL to comprehensively capture molecular information. Overall, the proposed SGGRL model showcases its potential to revolutionize molecular property prediction by leveraging multi-modal representation learning to extract diverse and comprehensive molecular insights. Our code is released at https://github.com/Vencent-Won/SGGRL.

* 8 pages, 3 figures

Via

Access Paper or Ask Questions

Augmenting Radio Signals with Wavelet Transform for Deep Learning-Based Modulation Recognition

Nov 07, 2023

Tao Chen, Shilian Zheng, Kunfeng Qiu, Luxin Zhang, Qi Xuan, Xiaoniu Yang

Abstract:The use of deep learning for radio modulation recognition has become prevalent in recent years. This approach automatically extracts high-dimensional features from large datasets, facilitating the accurate classification of modulation schemes. However, in real-world scenarios, it may not be feasible to gather sufficient training data in advance. Data augmentation is a method used to increase the diversity and quantity of training dataset and to reduce data sparsity and imbalance. In this paper, we propose data augmentation methods that involve replacing detail coefficients decomposed by discrete wavelet transform for reconstructing to generate new samples and expand the training set. Different generation methods are used to generate replacement sequences. Simulation results indicate that our proposed methods significantly outperform the other augmentation methods.

Via

Access Paper or Ask Questions

RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets

Oct 10, 2023

Yao Lu, Yutian Huang, Jiaqi Nie, Zuohui Chen, Qi Xuan

Figure 1 for RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets

Figure 2 for RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets

Figure 3 for RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets

Figure 4 for RK-core: An Established Methodology for Exploring the Hierarchical Structure within Datasets

Abstract:Recently, the field of machine learning has undergone a transition from model-centric to data-centric. The advancements in diverse learning tasks have been propelled by the accumulation of more extensive datasets, subsequently facilitating the training of larger models on these datasets. However, these datasets remain relatively under-explored. To this end, we introduce a pioneering approach known as RK-core, to empower gaining a deeper understanding of the intricate hierarchical structure within datasets. Across several benchmark datasets, we find that samples with low coreness values appear less representative of their respective categories, and conversely, those with high coreness values exhibit greater representativeness. Correspondingly, samples with high coreness values make a more substantial contribution to the performance in comparison to those with low coreness values. Building upon this, we further employ RK-core to analyze the hierarchical structure of samples with different coreset selection methods. Remarkably, we find that a high-quality coreset should exhibit hierarchical diversity instead of solely opting for representative samples. The code is available at https://github.com/yaolu-zjut/Kcore.

Via

Access Paper or Ask Questions