DNNs trained on natural clean samples have been shown to perform poorly on corrupted samples, such as noisy or blurry images. Various data augmentation methods have been recently proposed to improve DNN's robustness against common corruptions. Despite their success, they require computationally expensive training and cannot be applied to off-the-shelf trained models. Recently, it has been shown that updating BatchNorm (BN) statistics of an off-the-shelf model on a single corruption improves its accuracy on that corruption significantly. However, adopting the idea at inference time when the type of corruption is unknown and changing decreases the effectiveness of this method. In this paper, we harness the Fourier domain to detect the corruption type, a challenging task in the image domain. We propose a unified framework consisting of a corruption-detection model and BN statistics update that improves the corruption accuracy of any off-the-shelf trained model. We benchmark our framework on different models and datasets. Our results demonstrate about 8% and 4% accuracy improvement on CIFAR10-C and ImageNet-C, respectively. Furthermore, our framework can further improve the accuracy of state-of-the-art robust models, such as AugMix and DeepAug.
With the wide-spread application of machine learning models, it has become critical to study the potential data leakage of models trained on sensitive data. Recently, various membership inference (MI) attacks are proposed that determines if a sample was part of the training set or not. Although the first generation of MI attacks has been proven to be ineffective in practice, a few recent studies proposed practical MI attacks that achieve reasonable true positive rate at low false positive rate. The question is whether these attacks can be reliably used in practice. We showcase a practical application of membership inference attacks where it is used by an auditor (investigator) to prove to a judge/jury that an auditee unlawfully used sensitive data during training. Then, we show that the auditee can provide a dataset (with potentially unlimited number of samples) to a judge where MI attacks catastrophically fail. Hence, the auditee challenges the credibility of the auditor and can get the case dismissed. More importantly, we show that the auditee does not need to know anything about the MI attack neither a query access to it. In other words, all currently SOTA MI attacks in literature suffer from the same issue. Through comprehensive experimental evaluation, we show that our algorithms can increase the false positive rate from ten to thousands times larger than what auditor claim to the judge. Lastly, we argue that the implication of our algorithms is beyond discredibility: Current membership inference attacks can identify the memorized subpopulations, but they cannot reliably identify which exact sample in the subpopulation was used during training.
Membership inference attacks allow a malicious entity to predict whether a sample is used during training of a victim model or not. State-of-the-art membership inference attacks have shown to achieve good accuracy which poses a great privacy threat. However, majority of SOTA attacks require training dozens to hundreds of shadow models to accurately infer membership. This huge computation cost raises questions about practicality of these attacks on deep models. In this paper, we introduce a fundamentally different MI attack approach which obviates the need to train hundreds of shadow models. Simply put, we compare the victim model output on the target sample versus the samples from the same subpopulation (i.e., semantically similar samples), instead of comparing it with the output of hundreds of shadow models. The intuition is that the model response should not be significantly different between the target sample and its subpopulation if it was not a training sample. In cases where subpopulation samples are not available to the attacker, we show that training only a single generative model can fulfill the requirement. Hence, we achieve the state-of-the-art membership inference accuracy while significantly reducing the training computation cost.
Membership inference (MI) determines if a sample was part of a victim model training set. Recent development of MI attacks focus on record-level membership inference which limits their application in many real-world scenarios. For example, in the person re-identification task, the attacker (or investigator) is interested in determining if a user's images have been used during training or not. However, the exact training images might not be accessible to the attacker. In this paper, we develop a user-level MI attack where the goal is to find if any sample from the target user has been used during training even when no exact training sample is available to the attacker. We focus on metric embedding learning due to its dominance in person re-identification, where user-level MI attack is more sensible. We conduct an extensive evaluation on several datasets and show that our approach achieves high accuracy on user-level MI task.
Deep ensemble learning has been shown to improve accuracy by training multiple neural networks and fusing their outputs. Ensemble learning has also been used to defend against membership inference attacks that undermine privacy. In this paper, we empirically demonstrate a trade-off between these two goals, namely accuracy and privacy (in terms of membership inference attacks), in deep ensembles. Using a wide range of datasets and model architectures, we show that the effectiveness of membership inference attacks also increases when ensembling improves accuracy. To better understand this trade-off, we study the impact of various factors such as prediction confidence and agreement between models that constitute the ensemble. Finally, we evaluate defenses against membership inference attacks based on regularization and differential privacy. We show that while these defenses can mitigate the effectiveness of the membership inference attack, they simultaneously degrade ensemble accuracy. The source code is available at https://github.com/shrezaei/MI-on-EL.
Deep ensemble learning aims to improve the classification accuracy by training several neural networks and fusing their outputs. It has been widely shown to improve accuracy. At the same time, ensemble learning has also been proposed to mitigate privacy leakage in terms of membership inference (MI), where the goal of an attacker is to infer whether a particular data sample has been used to train a target model. In this paper, we show that these two goals of ensemble learning, namely improving accuracy and privacy, directly conflict with each other. Using a wide range of datasets and model architectures, we empirically demonstrate the trade-off between privacy and accuracy in deep ensemble learning. We find that ensembling can improve either privacy or accuracy, but not both simultaneously -- when ensembling improves the classification accuracy, the effectiveness of the MI attack also increases. We analyze various factors that contribute to such privacy leakage in ensembling such as prediction confidence and agreement between models that constitute the ensemble. Our evaluation of defenses against MI attacks, such as regularization and differential privacy, shows that they can mitigate the effectiveness of the MI attack but simultaneously degrade ensemble accuracy. The source code is available at https://github.com/shrezaei/MI-on-EL.
Recent studies propose membership inference (MI) attacks on deep models. Despite the moderate accuracy of such MI attacks, we show that the way the attack accuracy is reported is often misleading and a simple blind attack which is highly unreliable and inefficient in reality can often represent similar accuracy. We show that the current MI attack models can only identify the membership of misclassified samples with mediocre accuracy at best, which only constitute a very small portion of training samples. We analyze several new features that have not been explored for membership inference before, including distance to the decision boundary and gradient norms, and conclude that deep models' responses are mostly indistinguishable among train and non-train samples. Moreover, in contrast with general intuition that deeper models have a capacity to memorize training samples, and, hence, they are more vulnerable to membership inference, we find no evidence to support that and in some cases deeper models are often harder to launch membership inference attack on. Furthermore, despite the common belief, we show that overfitting does not necessarily lead to higher degree of membership leakage. We conduct experiments on MNIST, CIFAR-10, CIFAR-100, and ImageNet, using various model architecture, including LeNet, ResNet, DenseNet, InceptionV3, and Xception. Source code: https://github.com/shrezaei/MI-Attack}{\color{blue} {https://github.com/shrezaei/MI-Attack}.
Despite the plethora of studies about security vulnerabilities and defenses of deep learning models, security aspects of deep learning methodologies, such as transfer learning, have been rarely studied. In this article, we highlight the security challenges and research opportunities of these methodologies, focusing on vulnerabilities and attacks unique to them.
Traffic classification has various applications in today's Internet, from resource allocation, billing and QoS purposes in ISPs to firewall and malware detection in clients. Classical machine learning algorithms and deep learning models have been widely used to solve the traffic classification task. However, training such models requires a large amount of labeled data. Labeling data is often the most difficult and time-consuming process in building a classifier. To solve this challenge, we reformulate the traffic classification into a multi-task learning framework where bandwidth requirement and duration of a flow are predicted along with the traffic class. The motivation of this approach is twofold: First, bandwidth requirement and duration are useful in many applications, including routing, resource allocation, and QoS provisioning. Second, these two values can be obtained from each flow easily without the need for human labeling or capturing flows in a controlled and isolated environment. We show that with a large amount of easily obtainable data samples for bandwidth and duration prediction tasks, and only a few data samples for the traffic classification task, one can achieve high accuracy. We conduct two experiment with ISCX and QUIC public datasets and show the efficacy of our approach.
Due to the lack of enough training data and high computational cost to train a deep neural network from scratch, transfer learning has been extensively used in many deep-neural-network-based applications, such as face recognition, image classification, speech recognition, etc. A commonly-used transfer learning approach involves taking a part of a pre-trained model, adding a few layers at the end, and re-training the new layers with a small dataset. This approach, while efficient and widely used, imposes a security vulnerability because the pre-trained models used in transfer learning are usually available publicly to everyone, including potential attackers. In this paper, we show that without any additional knowledge other than the pre-trained model, an attacker can launch an effective and efficient brute force attack that can craft instances of input to trigger each target class with high confidence. Note that we assume that the attacker does not have access to any target-specific information, including samples from target classes, re-trained model, and probabilities assigned by Softmax to each class, and thus called target-agnostic attack. These assumptions render all previous attacks impractical, to the best of our knowledge. To evaluate the proposed attack, we perform a set of experiments on face recognition and speech recognition tasks and show the effectiveness of the attack. Our work sheds light on a fundamental security challenge of transfer learning in deep neural networks.