Michael Backes

Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders

Jan 19, 2022
Zeyang Sha, Xinlei He, Ning Yu, Michael Backes, Yang Zhang

Unsupervised representation learning techniques have been developing rapidly to make full use of unlabeled images. They encode images into rich features that are oblivious to downstream tasks. Behind their revolutionary representation power, the requirements for dedicated model designs and a massive amount of computation resources expose image encoders to the risk of model stealing attacks, a cheap way to mimic a well-trained encoder's performance while circumventing these demanding requirements. Yet conventional attacks only target supervised classifiers given their predicted labels and/or posteriors, which leaves the vulnerability of unsupervised encoders unexplored. In this paper, we first instantiate the conventional stealing attacks against encoders and demonstrate that encoders are more vulnerable than downstream classifiers. To better leverage the rich representation of encoders, we further propose Cont-Steal, a contrastive-learning-based attack, and validate its improved stealing effectiveness in various experiment settings. As a takeaway, we call our community's attention to the intellectual property protection of representation learning techniques, especially to defenses against encoder stealing attacks like ours.
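
To make the term "contrastive-learning-based" concrete, the following is a minimal sketch of a generic InfoNCE-style contrastive loss in plain Python. It is not the loss defined in the paper, which the abstract does not specify. The function names, the temperature value, and the pairing of row i in one embedding list with row i in the other are all assumptions made for illustration. Embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, keys, temperature=0.5):
    """Mean InfoNCE loss (illustrative, hypothetical helper).

    queries[i] and keys[i] are treated as a positive pair (two embeddings
    of the same image); keys[j] with j != i act as negatives.
    """
    losses = []
    for i, q in enumerate(queries):
        logits = [cosine(q, k) / temperature for k in keys]
        # Numerically stable log-sum-exp over all keys.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


keys = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each query matches its own key
mismatched = [[0.0, 1.0], [1.0, 0.0]]   # each query matches the wrong key

# Aligned pairs should give a lower loss than mismatched pairs.
print(info_nce(aligned, keys) < info_nce(mismatched, keys))
```

The loss is small when each embedding is close to its positive partner and far from the other embeddings in the batch, which is the general property a contrastive objective optimizes.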

Get a Model! Model Hijacking Attack Against Machine Learning Models

Nov 08, 2021
Ahmed Salem, Michael Backes, Yang Zhang

Machine learning (ML) has established itself as a cornerstone for various critical applications ranging from autonomous driving to authentication systems. However, with this increasing adoption rate of machine learning models, multiple attacks have emerged. One class of such attacks is the training-time attack, whereby an adversary executes their attack before or during the machine learning model training. In this work, we propose a new training-time attack against computer-vision-based machine learning models, namely the model hijacking attack. The adversary aims to hijack a target model to execute a different task than its original one without the model owner noticing. Model hijacking can cause accountability and security risks, since the owner of a hijacked model can be framed for having their model offer illegal or unethical services. Model hijacking attacks are launched in the same way as existing data poisoning attacks. However, one requirement of the model hijacking attack is to be stealthy, i.e., the data samples used to hijack the target model should look similar to the model's original training dataset. To this end, we propose two different model hijacking attacks, namely Chameleon and Adverse Chameleon, based on a novel encoder-decoder style ML model, namely the Camouflager. Our evaluation shows that both of our model hijacking attacks achieve a high attack success rate, with a negligible drop in model utility.

* To Appear in NDSS 2022

Inference Attacks Against Graph Neural Networks

Oct 06, 2021
Zhikun Zhang, Min Chen, Michael Backes, Yun Shen, Yang Zhang

Graphs are an important data representation that is ubiquitous in the real world. However, analyzing graph data is computationally difficult due to its non-Euclidean nature. Graph embedding is a powerful tool to solve the graph analytics problem by transforming the graph data into low-dimensional vectors. These vectors could also be shared with third parties to gain additional insights into what is behind the data. While sharing graph embeddings is intriguing, the associated privacy risks are unexplored. In this paper, we systematically investigate the information leakage of the graph embedding by mounting three inference attacks. First, we can successfully infer basic graph properties, such as the number of nodes, the number of edges, and graph density, of the target graph with up to 0.89 accuracy. Second, given a subgraph of interest and the graph embedding, we can determine with high confidence whether the subgraph is contained in the target graph. For instance, we achieve 0.98 attack AUC on the DD dataset. Third, we propose a novel graph reconstruction attack that can reconstruct a graph that has similar graph structural statistics to the target graph. We further propose an effective defense mechanism based on graph embedding perturbation to mitigate the inference attacks without noticeable performance degradation for graph classification tasks. Our code is available at https://github.com/Zhangzhk0819/GNN-Embedding-Leaks.
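
For readers unfamiliar with the basic graph properties named above, here is a small sketch that computes them from an edge list. It assumes an undirected simple graph and the standard density definition 2m / (n(n-1)); the abstract does not state which definition the paper uses. The function name `graph_properties` is hypothetical, and nodes that appear in no edge cannot be seen from an edge list alone.

```python
def graph_properties(edges):
    """Return node count, edge count, and density of an undirected simple graph.

    `edges` is an iterable of (u, v) pairs. Duplicate edges and self-loops
    are ignored when counting edges, which is an assumption of this sketch.
    """
    nodes = set()
    unique_edges = set()
    for u, v in edges:
        nodes.update((u, v))
        if u != v:
            # frozenset makes (u, v) and (v, u) the same undirected edge.
            unique_edges.add(frozenset((u, v)))
    n = len(nodes)
    m = len(unique_edges)
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": m, "density": density}


# A triangle is a complete graph on three nodes, so its density is 1.0.
print(graph_properties([(0, 1), (1, 2), (2, 0)]))
```

These are the kinds of quantities the first inference attack tries to predict from an embedding alone; the sketch only shows how the ground-truth values would be computed.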

* 19 pages, 18 figures. To Appear in the 31st USENIX Security Symposium

Mental Models of Adversarial Machine Learning

May 08, 2021
Lukas Bieringer, Kathrin Grosse, Michael Backes, Katharina Krombholz

Although machine learning (ML) is widely used in practice, little is known about practitioners' actual understanding of potential security challenges. In this work, we close this substantial gap in the literature and contribute a qualitative study focusing on developers' mental models of the ML pipeline and potentially vulnerable components. Studying mental models has helped in other security fields to discover root causes or improve risk communication. Our study reveals four characteristic ranges in mental models of industrial practitioners. The first range concerns the intertwined relationship of adversarial machine learning (AML) and classical security. The second range describes structural and functional components. The third range expresses individual variations of mental models, which are neither explained by the application nor by the educational background of the corresponding subjects. The fourth range corresponds to the varying levels of technical depth, which are however not determined by our subjects' level of knowledge. Our characteristic ranges have implications for the integration of AML into corporate workflows, security enhancing tools for practitioners, and creating appropriate regulatory frameworks for AML.

* 19 pages, 8 figures, under submission

Graph Unlearning

Mar 27, 2021
Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang

The right to be forgotten states that a data subject has the right to erase their data from an entity storing it. In the context of machine learning (ML), it requires the ML model provider to remove the data subject's data from the training set used to build the ML model, a process known as machine unlearning. While straightforward and legitimate, retraining the ML model from scratch upon receiving unlearning requests incurs high computational overhead when the training set is large. To address this issue, a number of approximate algorithms have been proposed in the domain of image and text data, among which SISA is the state-of-the-art solution. It randomly partitions the training set into multiple shards and trains a constituent model for each shard. However, directly applying SISA to the graph data can severely damage the graph structural information, and thereby the resulting ML model utility. In this paper, we propose GraphEraser, a novel machine unlearning method tailored to graph data. Its contributions include two novel graph partition algorithms, and a learning-based aggregation method. We conduct extensive experiments on five real-world datasets to illustrate the unlearning efficiency and model utility of GraphEraser. We observe that GraphEraser achieves 2.06× (small dataset) to 35.94× (large dataset) unlearning time improvement compared to retraining from scratch. On the other hand, GraphEraser achieves up to 62.5% higher F1 score than that of random partitioning. In addition, our proposed learning-based aggregation method achieves up to 112% higher F1 score than that of the majority vote aggregation.
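
The shard-based idea described above (random partitioning, one constituent model per shard, majority-vote aggregation) can be sketched as follows. This is a toy illustration under stated assumptions, not SISA or GraphEraser themselves: the "model" is a stand-in that just remembers the most common label in its shard, all function names are hypothetical, and the graph-aware partitioning and learned aggregation proposed in the paper are not shown.

```python
import random
from collections import Counter


def partition(samples, num_shards, seed=0):
    """Randomly split samples into num_shards roughly equal shards."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::num_shards] for i in range(num_shards)]


def train(shard):
    """Stand-in constituent model: the most common label in the shard."""
    labels = Counter(label for _, label in shard)
    return labels.most_common(1)[0][0] if labels else None


def predict(models):
    """Majority vote over the constituent models' predictions."""
    votes = Counter(m for m in models if m is not None)
    return votes.most_common(1)[0][0]


def unlearn(shards, models, sample):
    """Remove one sample and retrain only the shard that contained it."""
    for i, shard in enumerate(shards):
        if sample in shard:
            shard.remove(sample)
            models[i] = train(shard)
            return i  # index of the single shard that was retrained
    return None


samples = [(i, i % 2) for i in range(12)]  # (sample id, label) pairs
shards = partition(samples, num_shards=3)
models = [train(shard) for shard in shards]
retrained = unlearn(shards, models, samples[0])
print("retrained shard:", retrained, "prediction:", predict(models))
```

The efficiency gain comes from `unlearn` touching one shard instead of the whole training set; the cost, as the abstract notes for graph data, is that random partitioning can discard structure that the constituent models need.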

Node-Level Membership Inference Attacks Against Graph Neural Networks

Feb 10, 2021
Xinlei He, Rui Wen, Yixin Wu, Michael Backes, Yun Shen, Yang Zhang

Much real-world data comes in the form of graphs, such as social networks and protein structures. To fully utilize the information contained in graph data, a new family of machine learning (ML) models, namely graph neural networks (GNNs), has been introduced. Previous studies have shown that machine learning models are vulnerable to privacy attacks. However, most of the current efforts concentrate on ML models trained on data from the Euclidean space, like images and texts. On the other hand, privacy risks stemming from GNNs remain largely unstudied. In this paper, we fill the gap by performing the first comprehensive analysis of node-level membership inference attacks against GNNs. We systematically define the threat models and propose three node-level membership inference attacks based on an adversary's background knowledge. Our evaluation on three GNN structures and four benchmark datasets shows that GNNs are vulnerable to node-level membership inference even when the adversary has minimal background knowledge. Besides, we show that graph density and feature similarity have a major impact on the attack's success. We further investigate two defense mechanisms, and the empirical results indicate that these defenses can reduce the attack performance but with moderate utility loss.

ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

Feb 04, 2021
Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang

Inference attacks against Machine Learning (ML) models allow adversaries to learn information about training data, model parameters, etc. While researchers have studied these attacks thoroughly, they have done so in isolation. We lack a comprehensive picture of the risks caused by the attacks, such as the different scenarios they can be applied to, the common factors that influence their performance, the relationship among them, or the effectiveness of defense techniques. In this paper, we fill this gap by presenting a first-of-its-kind holistic risk assessment of different inference attacks against machine learning models. We concentrate on four attacks, namely membership inference, model inversion, attribute inference, and model stealing, and establish a threat model taxonomy. Our extensive experimental evaluation conducted over five model architectures and four datasets shows that the complexity of the training dataset plays an important role with respect to the attack's performance, while the effectiveness of model stealing attacks and that of membership inference attacks are negatively correlated. We also show that defenses like DP-SGD and Knowledge Distillation can only hope to mitigate some of the inference attacks. Our analysis relies on modular, reusable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models, and equally serves as a benchmark tool for researchers and practitioners.

BAAAN: Backdoor Attacks Against Autoencoder and GAN-Based Machine Learning Models

Oct 08, 2020
Ahmed Salem, Yannick Sautter, Michael Backes, Mathias Humbert, Yang Zhang

The tremendous progress of autoencoders and generative adversarial networks (GANs) has led to their application to multiple critical tasks, such as fraud detection and sanitized data generation. This increasing adoption has fostered the study of security and privacy risks stemming from these models. However, previous works have mainly focused on membership inference attacks. In this work, we explore one of the most severe attacks against machine learning models, namely the backdoor attack, against both autoencoders and GANs. The backdoor attack is a training-time attack where the adversary implements a hidden backdoor in the target model that can only be activated by a secret trigger. State-of-the-art backdoor attacks focus on classification-based tasks. We extend the applicability of backdoor attacks to autoencoders and GAN-based models. More concretely, we propose the first backdoor attack against autoencoders and GANs where the adversary can control what the decoded or generated images are when the backdoor is activated. Our results show that the adversary can build a backdoored autoencoder that returns a target output for all backdoored inputs, while behaving perfectly normally on clean inputs. Similarly, for the GANs, our experiments show that the adversary can generate data from a different distribution when the backdoor is activated, while maintaining the same utility when the backdoor is not.

Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks

Oct 07, 2020
Ahmed Salem, Michael Backes, Yang Zhang

Backdoor attacks against deep neural networks are currently being intensively investigated due to their severe security consequences. Current state-of-the-art backdoor attacks require the adversary to modify the input, usually by adding a trigger to it, for the target model to activate the backdoor. This added trigger not only increases the difficulty of launching the backdoor attack in the physical world, but also can be easily detected by multiple defense mechanisms. In this paper, we present the first triggerless backdoor attack against deep neural networks, where the adversary does not need to modify the input for triggering the backdoor. Our attack is based on the dropout technique. Concretely, we associate a set of target neurons that are dropped out during model training with the target label. In the prediction phase, the model will output the target label when the target neurons are dropped again, i.e., the backdoor attack is launched. This triggerless feature of our attack makes it practical in the physical world. Extensive experiments show that our triggerless backdoor attack achieves a perfect attack success rate with negligible damage to the model's utility.

Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning

Sep 10, 2020
Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang

While being deployed in many critical applications as core components, machine learning (ML) models are vulnerable to various security and privacy attacks. One major privacy attack in this domain is membership inference, where an adversary aims to determine whether a target data sample is part of the training set of a target ML model. So far, most membership inference attacks have been evaluated against ML models trained from scratch. However, real-world ML models are typically trained following the transfer learning paradigm, where a model owner takes a pretrained model learned from a different dataset, namely the teacher model, and trains her own student model by fine-tuning the teacher model with her own data. In this paper, we perform the first systematic evaluation of membership inference attacks against transfer learning models. We adopt the strategy of shadow model training to derive the data for training our membership inference classifier. Extensive experiments on four real-world image datasets show that membership inference can achieve effective performance. For instance, on the CIFAR100 classifier transferred from ResNet20 (pretrained with Caltech101), our membership inference achieves 95% attack AUC. Moreover, we show that membership inference is still effective when the architecture of the target model is unknown. Our results shed light on the severity of membership risks stemming from machine learning models in practice.
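
As a reading aid for the "attack AUC" figure quoted above, here is a minimal sketch of how an AUC can be computed from attack scores. It uses the standard pairwise interpretation: the AUC is the probability that a randomly chosen member receives a higher score than a randomly chosen non-member, with ties counted as one half. The function name and the example scores are hypothetical; the abstract does not describe how the paper's scores are produced beyond the use of shadow models.

```python
def attack_auc(member_scores, nonmember_scores):
    """AUC as the fraction of (member, non-member) pairs ranked correctly.

    A score is the attack's confidence that a sample was in the training
    set. Ties contribute 0.5. Both lists must be non-empty.
    """
    correct = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                correct += 1.0
            elif m == n:
                correct += 0.5
    return correct / (len(member_scores) * len(nonmember_scores))


# Hypothetical scores: three of the four member/non-member pairs are
# ranked correctly, so the AUC is 0.75.
print(attack_auc([0.9, 0.3], [0.5, 0.1]))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so a 95% attack AUC means the attack ranks a member above a non-member in 95% of such pairs.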
