Zhan Qin

RemovalNet: DNN Fingerprint Removal Attacks

Aug 31, 2023
Hongwei Yao, Zheng Li, Kunzhe Huang, Jian Lou, Zhan Qin, Kui Ren

With the performance of deep neural networks (DNNs) remarkably improving, DNNs have been widely used in many areas. Consequently, the DNN model has become a valuable asset, and its intellectual property is safeguarded by ownership verification techniques (e.g., DNN fingerprinting). However, the feasibility of the DNN fingerprint removal attack and its potential influence remain an open problem. In this paper, we perform the first comprehensive investigation of DNN fingerprint removal attacks. Generally, the knowledge contained in a DNN model can be categorized into general semantic knowledge and fingerprint-specific knowledge. Building on this distinction, we propose RemovalNet, a min-max bilevel optimization-based DNN fingerprint removal attack designed to evade model ownership verification. The lower-level optimization removes fingerprint-specific knowledge, while the upper-level optimization distills the victim model's general semantic knowledge to maintain the surrogate model's performance. We conduct extensive experiments to evaluate the fidelity, effectiveness, and efficiency of RemovalNet against four advanced defense methods on six metrics. The empirical results demonstrate that (1) RemovalNet is effective: after our DNN fingerprint removal attack, the model distance between the target and surrogate models is about 100x larger than that of the baseline attacks; (2) RemovalNet is efficient: it uses only 0.2% (400 samples) of the substitute dataset and 1,000 iterations to conduct the attack, and compared with advanced model stealing attacks, it saves up to nearly 85% of computational resources; (3) RemovalNet achieves high fidelity: the surrogate model maintains high accuracy after the fingerprint removal process. Our code is available at: https://github.com/grasses/RemovalNet.
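
The min-max bilevel structure described above can be illustrated with a short PyTorch sketch. The snippet below is an assumption-laden illustration, not the paper's actual algorithm: `victim` and `surrogate` are assumed to return a (features, logits) tuple, and `feat_opt`/`logit_opt` are ordinary optimizers over the surrogate's parameters (possibly targeting different layer groups). The lower level pushes the surrogate's intermediate features away from the victim's (removing fingerprint-specific knowledge), while the upper level distills the victim's soft logits to preserve general semantic knowledge.

```python
import torch
import torch.nn.functional as F

def removal_step(victim, surrogate, feat_opt, logit_opt, x, alpha=1.0, temp=4.0):
    """One hypothetical min-max step; both models return (features, logits)."""
    victim.eval()
    with torch.no_grad():
        v_feats, v_logits = victim(x)

    # Lower level (max): push surrogate features away from the victim's,
    # i.e., remove fingerprint-specific knowledge.
    s_feats, _ = surrogate(x)
    feat_loss = -F.mse_loss(s_feats, v_feats)
    feat_opt.zero_grad()
    feat_loss.backward()
    feat_opt.step()

    # Upper level (min): distill the victim's general semantic knowledge
    # so the surrogate keeps its task accuracy.
    _, s_logits = surrogate(x)
    kd_loss = alpha * F.kl_div(
        F.log_softmax(s_logits / temp, dim=1),
        F.softmax(v_logits / temp, dim=1),
        reduction="batchmean",
    ) * temp * temp
    logit_opt.zero_grad()
    kd_loss.backward()
    logit_opt.step()
    return feat_loss.item(), kd_loss.item()
```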


FINER: Enhancing State-of-the-art Classifiers with Feature Attribution to Facilitate Security Analysis

Aug 10, 2023
Yiling He, Jian Lou, Zhan Qin, Kui Ren

Deep learning classifiers achieve state-of-the-art performance in various risk detection applications. They explore rich semantic representations and are supposed to automatically discover risk behaviors. However, due to the lack of transparency, the behavioral semantics cannot be conveyed to downstream security experts to reduce their heavy workload in security analysis. Although feature attribution (FA) methods can be used to explain deep learning, the underlying classifier remains blind to what behavior is suspicious, and the generated explanation cannot adapt to downstream tasks, incurring poor explanation fidelity and intelligibility. In this paper, we propose FINER, the first framework for risk detection classifiers to generate high-fidelity and high-intelligibility explanations. The high-level idea is to gather explanation efforts from the model developer, FA designers, and security experts. To improve fidelity, we fine-tune the classifier with an explanation-guided multi-task learning strategy. To improve intelligibility, we engage task knowledge to adjust and ensemble FA methods. Extensive evaluations show that FINER improves explanation quality for risk detection. Moreover, we demonstrate that FINER outperforms a state-of-the-art tool in facilitating malware analysis.
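
To make the "adjust and ensemble FA methods" idea concrete, here is a minimal, hypothetical sketch that combines two simple gradient-based attribution methods; the weights `w1`/`w2` stand in for task-knowledge-driven adjustment and are an assumption, not FINER's actual rule. The model is assumed to be a standard classifier returning logits.

```python
import torch

def saliency(model, x, target):
    # Plain gradient magnitude w.r.t. the input.
    x = x.clone().requires_grad_(True)
    score = model(x)[torch.arange(x.size(0)), target].sum()
    grad, = torch.autograd.grad(score, x)
    return grad.abs()

def input_x_gradient(model, x, target):
    # Input-times-gradient attribution.
    x = x.clone().requires_grad_(True)
    score = model(x)[torch.arange(x.size(0)), target].sum()
    grad, = torch.autograd.grad(score, x)
    return (grad * x).abs()

def ensemble_attribution(model, x, target, w1=0.5, w2=0.5):
    a1 = saliency(model, x, target)
    a2 = input_x_gradient(model, x, target)
    # Normalize each map so their scales are comparable before weighting.
    dims = tuple(range(1, a1.dim()))
    a1 = a1 / (a1.amax(dim=dims, keepdim=True) + 1e-8)
    a2 = a2 / (a2.amax(dim=dims, keepdim=True) + 1e-8)
    return w1 * a1 + w2 * a2
```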


FDINet: Protecting against DNN Model Extraction via Feature Distortion Index

Jun 22, 2023
Hongwei Yao, Zheng Li, Haiqin Weng, Feng Xue, Kui Ren, Zhan Qin


Machine Learning as a Service (MLaaS) platforms have gained popularity due to their accessibility, cost-efficiency, scalability, and rapid development capabilities. However, recent research has highlighted the vulnerability of cloud-based models in MLaaS to model extraction attacks. In this paper, we introduce FDINET, a novel defense mechanism that leverages the feature distribution of deep neural network (DNN) models. Concretely, by analyzing the feature distribution of the adversary's queries, we reveal that it deviates from the feature distribution of the model's training set. Based on this key observation, we propose the Feature Distortion Index (FDI), a metric designed to quantitatively measure the feature distribution deviation of received queries. FDINET uses FDI to train a binary detector and exploits FDI similarity to identify colluding adversaries in distributed extraction attacks. We conduct extensive experiments to evaluate FDINET against six state-of-the-art extraction attacks on four benchmark datasets and four popular model architectures. Empirical results demonstrate the following findings: FDINET is highly effective in detecting model extraction, achieving 100% detection accuracy on DFME and DaST; it is highly efficient, using just 50 queries to raise an extraction alarm with an average confidence of 96.08% on GTSRB; and it identifies colluding adversaries with an accuracy exceeding 91%. Additionally, FDINET detects two types of adaptive attacks.
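
As a rough illustration of a feature-distortion-style detector, the sketch below approximates the distortion of a query as the distance between its penultimate-layer features and the centroid of its predicted class. This is an assumption for illustration, not necessarily the paper's exact FDI definition, and `feature_extractor`/`classifier` are hypothetical splits of the protected model.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def class_centroids(feature_extractor, loader, num_classes, device="cpu"):
    # Per-class mean of penultimate features over the training set.
    sums, counts = None, torch.zeros(num_classes, device=device)
    for x, y in loader:
        f = feature_extractor(x.to(device))
        if sums is None:
            sums = torch.zeros(num_classes, f.shape[1], device=device)
        sums.index_add_(0, y.to(device), f)
        counts += torch.bincount(y.to(device), minlength=num_classes).float()
    return sums / counts.clamp(min=1).unsqueeze(1)

@torch.no_grad()
def feature_distortion(feature_extractor, classifier, x, centroids):
    # One distortion score per received query.
    f = feature_extractor(x)
    pred = classifier(f).argmax(dim=1)
    return (f - centroids[pred]).norm(dim=1)

# A tiny binary detector that could be trained on distortion scores
# from benign traffic vs. simulated extraction queries.
detector = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 2))
```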

* 13 pages, 7 figures 


Quantifying and Defending against Privacy Threats on Federated Knowledge Graph Embedding

Apr 06, 2023
Yuke Hu, Wei Liang, Ruofan Wu, Kai Xiao, Weiqiang Wang, Xiaochen Li, Jinfei Liu, Zhan Qin


Knowledge Graph Embedding (KGE) is a fundamental technique that extracts expressive representations from a knowledge graph (KG) to facilitate diverse downstream tasks. The emerging federated KGE (FKGE) collaboratively trains over distributed KGs held by clients while avoiding the exchange of clients' sensitive raw KGs, yet it can still suffer from privacy threats, as evidenced in the federated training of other models (e.g., neural networks). However, quantifying and defending against such privacy threats remain unexplored for FKGE, which possesses unique properties not shared by previously studied models. In this paper, we conduct the first holistic study of the privacy threat on FKGE from both attack and defense perspectives. For the attack, we quantify the privacy threat by proposing three new inference attacks, which reveal substantial privacy risk by successfully inferring the existence of KG triples from victim clients. For the defense, we propose DP-Flames, a novel differentially private FKGE with private selection, which offers a better privacy-utility tradeoff by exploiting the entity-binding sparse gradient property of FKGE and comes with a tight privacy accountant by incorporating the state-of-the-art private selection technique. We further propose an adaptive privacy budget allocation policy to dynamically adjust the defense magnitude across the training procedure. Comprehensive evaluations demonstrate that the proposed defense successfully mitigates the privacy threat, reducing the success rate of inference attacks from 83.1% to 59.4% on average with only a modest utility decrease.
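
The "entity-binding sparse gradient" idea can be sketched as a differentially private update that clips and noises only the embedding rows touched in the current batch. This is a generic Gaussian-mechanism sketch under stated assumptions (`grad` is already aggregated per touched entity); the actual DP-Flames accountant, private selection step, and adaptive budget schedule are not shown.

```python
import torch

def dp_sparse_embedding_step(embedding, touched_ids, grad, lr, clip_norm, noise_multiplier):
    # embedding: torch.nn.Embedding holding entity vectors.
    # touched_ids: LongTensor of entity ids used in this batch.
    # grad: [num_touched, dim] gradient restricted to those entities.
    per_row_norm = grad.norm(dim=1, keepdim=True).clamp(min=1e-12)
    clipped = grad * (clip_norm / per_row_norm).clamp(max=1.0)   # row-wise clipping
    noise = torch.randn_like(clipped) * noise_multiplier * clip_norm
    with torch.no_grad():
        embedding.weight[touched_ids] -= lr * (clipped + noise)
```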

* Accepted at the ACM Web Conference (WWW 2023) 

FedTracker: Furnishing Ownership Verification and Traceability for Federated Learning Model

Nov 14, 2022
Shuo Shao, Wenyuan Yang, Hanlin Gu, Jian Lou, Zhan Qin, Lixin Fan, Qiang Yang, Kui Ren


Copyright protection of the Federated Learning (FL) model has become a major concern, since malicious clients in FL can stealthily distribute or sell the FL model to other parties. In order to prevent such misbehavior, one must be able to catch the culprit by investigating trace evidence from the model in question. In this paper, we propose FedTracker, the first FL model protection framework that, on the one hand, employs global watermarks to verify ownership of the global model and, on the other hand, embeds unique local fingerprints into the respective local models to facilitate tracing the model back to the culprit. Furthermore, FedTracker introduces the intuition of Continual Learning (CL) into watermark embedding and proposes a CL-based watermark mechanism to improve fidelity. Experimental results show that the proposed FedTracker is effective in ownership verification, traceability, fidelity, and robustness.
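
A common way to realize white-box watermark embedding is a projection-based loss on a chosen weight tensor, combined with a regularizer that limits drift from the pre-embedding parameters (the continual-learning flavor). The sketch below is only an illustration under those assumptions; FedTracker's actual embedding objective and local fingerprint design may differ.

```python
import torch
import torch.nn.functional as F

def watermark_loss(weight, proj, bits):
    # weight: parameter tensor carrying the watermark.
    # proj: fixed random projection matrix of shape [weight.numel(), num_bits].
    # bits: {0, 1} watermark string of length num_bits.
    logits = weight.flatten() @ proj
    return F.binary_cross_entropy_with_logits(logits, bits.float())

def cl_regularizer(model, anchor_params):
    # Penalize drift from the pre-embedding parameters to preserve utility.
    return sum(F.mse_loss(p, a) for p, a in zip(model.parameters(), anchor_params))
```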


OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization

Oct 04, 2022
Xiaochen Li, Yuke Hu, Weiran Liu, Hanwen Feng, Li Peng, Yuan Hong, Kui Ren, Zhan Qin


Vertical Federated Learning (FL) is a new paradigm that enables users holding non-overlapping attributes of the same data samples to jointly train a model without directly sharing the raw data. Nevertheless, recent works show that this is still not sufficient to prevent privacy leakage from the training process or the trained model. This paper focuses on privacy-preserving tree boosting algorithms under vertical FL. Existing solutions based on cryptography involve heavy computation and communication overhead and are vulnerable to inference attacks. Although the solution based on Local Differential Privacy (LDP) addresses these problems, it leads to low accuracy of the trained model. This paper explores improving the accuracy of the widely deployed tree boosting algorithms while satisfying differential privacy under vertical FL. Specifically, we introduce a framework called OpBoost. Three order-preserving desensitization algorithms satisfying a variant of LDP called distance-based LDP (dLDP) are designed to desensitize the training data. In particular, we optimize the dLDP definition and study efficient sampling distributions to further improve the accuracy and efficiency of the proposed algorithms. The proposed algorithms provide a trade-off between the privacy of pairs with large distance and the utility of desensitized values. Comprehensive evaluations show that OpBoost achieves better prediction accuracy than existing LDP approaches under reasonable settings. Our code is open source.
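
To give a feel for order-preserving desensitization, here is a minimal, hypothetical dLDP-style mechanism: the feature value is bucketized and its bucket index is perturbed with a distance-sensitive distribution, so nearby buckets are more likely than distant ones. This is an illustrative sketch, not OpBoost's actual algorithm or parameterization.

```python
import numpy as np

def desensitize(value, lo, hi, num_buckets=64, eps_per_unit=0.5, rng=None):
    """Bucketize `value` in [lo, hi] and report a randomized bucket midpoint."""
    rng = rng if rng is not None else np.random.default_rng()
    edges = np.linspace(lo, hi, num_buckets + 1)
    idx = int(np.clip(np.searchsorted(edges, value) - 1, 0, num_buckets - 1))
    dist = np.abs(np.arange(num_buckets) - idx)        # distance to the true bucket
    probs = np.exp(-eps_per_unit * dist / 2.0)         # geometric-style decay
    probs /= probs.sum()
    noisy_idx = rng.choice(num_buckets, p=probs)
    return 0.5 * (edges[noisy_idx] + edges[noisy_idx + 1])
```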


Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training

Jun 05, 2022
Guodong Cao, Zhibo Wang, Xiaowei Dong, Zhifei Zhang, Hengchang Guo, Zhan Qin, Kui Ren


Adversarial training has been widely explored for mitigating attacks against deep models. However, most existing works are still trapped in the dilemma between higher accuracy and stronger robustness, since they tend to fit a model towards robust features (those not easily tampered with by adversaries) while ignoring non-robust but highly predictive features. To achieve a better robustness-accuracy trade-off, we propose Vanilla Feature Distillation Adversarial Training (VFD-Adv), which conducts knowledge distillation from a pre-trained model (optimized towards high accuracy) to guide adversarial training towards higher accuracy, i.e., preserving those non-robust but predictive features. More specifically, both adversarial examples and their clean counterparts are forced to be aligned in the feature space by distilling predictive representations from the pre-trained/clean model, whereas previous works barely utilize predictive features from clean models. The adversarially trained model is thus updated towards maximally preserving accuracy while gaining robustness. A key advantage of our method is that it can be universally adapted to and boost existing works. Exhaustive experiments on various datasets, classification models, and adversarial training algorithms demonstrate the effectiveness of the proposed method.
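
One plausible training step matching this description is sketched below: both the clean input and its adversarial counterpart are pulled towards the pre-trained model's features while the adversarial example is classified. The `attack` routine (e.g., PGD), the models' `features` method, and the loss weight are assumptions for illustration; this is not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def vfd_adv_step(model, pretrained, optimizer, attack, x, y, lam=1.0):
    pretrained.eval()
    x_adv = attack(model, x, y)                 # craft adversarial counterparts
    with torch.no_grad():
        ref = pretrained.features(x)            # predictive "vanilla" features

    ce = F.cross_entropy(model(x_adv), y)       # standard adversarial training term
    # Align clean and adversarial features with the clean model's representations.
    distill = F.mse_loss(model.features(x), ref) + F.mse_loss(model.features(x_adv), ref)

    loss = ce + lam * distill
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```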

* 12 pages 

Backdoor Defense via Decoupling the Training Process

Feb 05, 2022
Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, Kui Ren


Recent studies have revealed that deep neural networks (DNNs) are vulnerable to backdoor attacks, where attackers embed hidden backdoors in the DNN model by poisoning a few training samples. The attacked model behaves normally on benign samples, whereas its prediction will be maliciously changed when the backdoor is activated. We reveal that poisoned samples tend to cluster together in the feature space of the attacked DNN model, which is mostly due to the end-to-end supervised training paradigm. Inspired by this observation, we propose a novel backdoor defense via decoupling the original end-to-end training process into three stages. Specifically, we first learn the backbone of a DNN model via self-supervised learning based on training samples without their labels. The learned backbone will map samples with the same ground-truth label to similar locations in the feature space. Then, we freeze the parameters of the learned backbone and train the remaining fully connected layers via standard training with all (labeled) training samples. Lastly, to further alleviate side effects of poisoned samples in the second stage, we remove the labels of some 'low-credible' samples determined based on the learned model and conduct a semi-supervised fine-tuning of the whole model. Extensive experiments on multiple benchmark datasets and DNN models verify that the proposed defense is effective in reducing backdoor threats while preserving high accuracy in predicting benign samples. Our code is available at https://github.com/SCLBD/DBD.
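
As a rough sketch of the decoupled pipeline's later stages (assuming the self-supervised backbone has already been trained), the code below trains only the classification head on frozen features and then flags 'low-credible' samples by their per-sample loss. It uses plain cross-entropy for illustration and is not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def train_head(backbone, head, optimizer, loader, device="cpu"):
    # Stage 2: frozen SSL backbone, train only the fully connected head.
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad_(False)
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        with torch.no_grad():
            feats = backbone(x)
        loss = F.cross_entropy(head(feats), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

@torch.no_grad()
def flag_low_credible(backbone, head, loader, keep_ratio=0.5, device="cpu"):
    # Samples with the largest per-sample loss are treated as unlabeled in stage 3.
    losses = []
    for x, y in loader:
        logits = head(backbone(x.to(device)))
        losses.append(F.cross_entropy(logits, y.to(device), reduction="none"))
    losses = torch.cat(losses)
    return losses > losses.quantile(keep_ratio)
```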

* Accepted by ICLR 2022. The first two authors contributed equally to this work. 25 pages 