Neil Zhenqiang Gong

PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees

Mar 03, 2023
Jinghuai Zhang, Jinyuan Jia, Hongbin Liu, Neil Zhenqiang Gong

Point cloud classification is an essential component in many security-critical applications such as autonomous driving and augmented reality. However, point cloud classifiers are vulnerable to adversarially perturbed point clouds. Existing certified defenses against adversarial point clouds suffer from a key limitation: their certified robustness guarantees are probabilistic, i.e., they produce an incorrect certified robustness guarantee with some probability. In this work, we propose a general framework, namely PointCert, that can transform an arbitrary point cloud classifier to be certifiably robust against adversarial point clouds with deterministic guarantees. PointCert certifiably predicts the same label for a point cloud when the number of arbitrarily added, deleted, and/or modified points is less than a threshold. Moreover, we propose multiple methods to optimize the certified robustness guarantees of PointCert in three application scenarios. We systematically evaluate PointCert on ModelNet and ScanObjectNN benchmark datasets. Our results show that PointCert substantially outperforms state-of-the-art certified defenses even though their robustness guarantees are probabilistic.

* CVPR 2023

REaaS: Enabling Adversarially Robust Downstream Classifiers via Robust Encoder as a Service

Jan 07, 2023
Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong

Encoder as a service is an emerging cloud service. Specifically, a service provider first pre-trains an encoder (i.e., a general-purpose feature extractor) via either supervised learning or self-supervised learning and then deploys it as a cloud service API. A client queries the cloud service API to obtain feature vectors for its training/testing inputs when training/testing its classifier (called downstream classifier). A downstream classifier is vulnerable to adversarial examples, which are testing inputs with carefully crafted perturbations that the downstream classifier misclassifies. Therefore, in safety- and security-critical applications, a client aims to build a robust downstream classifier and certify its robustness guarantees against adversarial examples. What APIs should the cloud service provide, such that a client can use any certification method to certify the robustness of its downstream classifier against adversarial examples while minimizing the number of queries to the APIs? How can a service provider pre-train an encoder such that clients can build more certifiably robust downstream classifiers? We aim to answer the two questions in this work. For the first question, we show that the cloud service only needs to provide two APIs, which we carefully design, to enable a client to certify the robustness of its downstream classifier with a minimal number of queries to the APIs. For the second question, we show that an encoder pre-trained using a spectral-norm regularization term enables clients to build more robust downstream classifiers.

* To appear in Network and Distributed System Security (NDSS) Symposium, 2023
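
The abstract mentions a spectral-norm regularization term but does not say how it is computed or which layers it covers. As a rough illustration only, the sketch below estimates the spectral norm (largest singular value) of a weight matrix by power iteration and adds a penalty to a task loss. The function names, the `coefficient` parameter, and the choice to sum the norms over all weight matrices are assumptions made for this example, not details taken from the paper.

```python
import math
import random


def spectral_norm(matrix, num_iters=100, seed=0):
    """Estimate the largest singular value of `matrix` by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    rng = random.Random(seed)
    # Random start so the vector is not orthogonal to the top singular vector
    # (except with negligible probability).
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    for _ in range(num_iters):
        # u = W v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # v = W^T u, then renormalize to unit length
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0  # W is the zero matrix (or the start vector was degenerate)
        v = [x / norm for x in v]
    # With v a unit vector close to the top right-singular vector, ||W v|| ~ sigma_max.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))


def regularized_loss(task_loss, weight_matrices, coefficient=0.01):
    """Hypothetical objective: task loss plus a penalty on each matrix's spectral norm."""
    penalty = sum(spectral_norm(w) for w in weight_matrices)
    return task_loss + coefficient * penalty
```

Bounding the spectral norm of each linear map limits how much a small input perturbation can change the output of that map, which is the usual motivation for this kind of penalty.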

AFLGuard: Byzantine-robust Asynchronous Federated Learning

Dec 13, 2022
Minghong Fang, Jia Liu, Neil Zhenqiang Gong, Elizabeth S. Bentley

Federated learning (FL) is an emerging machine learning paradigm, in which clients jointly learn a model with the help of a cloud server. A fundamental challenge of FL is that the clients are often heterogeneous, e.g., they have different computing powers, and thus the clients may send model updates to the server with substantially different delays. Asynchronous FL aims to address this challenge by enabling the server to update the model once any client's model update reaches it without waiting for other clients' model updates. However, like synchronous FL, asynchronous FL is also vulnerable to poisoning attacks, in which malicious clients manipulate the model via poisoning their local data and/or model updates sent to the server. Byzantine-robust FL aims to defend against poisoning attacks. In particular, Byzantine-robust FL can learn an accurate model even if some clients are malicious and have Byzantine behaviors. However, most existing studies on Byzantine-robust FL focused on synchronous FL, leaving asynchronous FL largely unexplored. In this work, we bridge this gap by proposing AFLGuard, a Byzantine-robust asynchronous FL method. We show that, both theoretically and empirically, AFLGuard is robust against various existing and adaptive poisoning attacks (both untargeted and targeted). Moreover, AFLGuard outperforms existing Byzantine-robust asynchronous FL methods.

* Accepted by ACSAC 2022
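
To make the asynchronous setting described in the abstract concrete, here is a minimal simulation in which the server applies each client's model update as soon as it arrives, without waiting for slower clients. It shows only the plain asynchronous update loop and includes no defense; AFLGuard's filtering rule is not described in the abstract and is not reproduced here. The function name, the `(arrival_time, client_id, update)` tuple layout, and the plain gradient-style step are assumptions made for this example.

```python
import heapq


def run_async_server(model, arrivals, learning_rate=0.1):
    """Apply client updates one at a time, in order of arrival.

    `arrivals` is a list of (arrival_time, client_id, update_vector) tuples.
    """
    queue = list(arrivals)
    heapq.heapify(queue)  # earliest arrival is popped first
    history = []
    while queue:
        arrival_time, client_id, update = heapq.heappop(queue)
        # The server updates immediately; it does not wait for other clients.
        model = [w - learning_rate * g for w, g in zip(model, update)]
        history.append((arrival_time, client_id))
    return model, history


# A fast client and a slow client: the fast client's update is applied first.
final_model, order = run_async_server(
    [0.0, 0.0],
    [(3.0, "slow", [1.0, 1.0]), (1.0, "fast", [2.0, 0.0])],
)
```

Because every arriving update changes the model directly, a single malicious update takes effect immediately in this loop, which is why the asynchronous setting needs a defense that can judge each update on its own.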

Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning

Dec 06, 2022
Hongbin Liu, Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong

Classifiers in supervised learning have various security and privacy issues, e.g., 1) data poisoning attacks, backdoor attacks, and adversarial examples on the security side as well as 2) inference attacks and the right to be forgotten for the training data on the privacy side. Various secure and privacy-preserving supervised learning algorithms with formal guarantees have been proposed to address these issues. However, they suffer from various limitations such as accuracy loss, small certified security guarantees, and/or inefficiency. Self-supervised learning is an emerging technique to pre-train encoders using unlabeled data. Given a pre-trained encoder as a feature extractor, supervised learning can train a simple yet accurate classifier using a small amount of labeled training data. In this work, we perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms. Our key findings are that a pre-trained encoder substantially improves 1) both accuracy under no attacks and certified security guarantees against data poisoning and backdoor attacks of state-of-the-art secure learning algorithms (i.e., bagging and KNN), 2) certified security guarantees of randomized smoothing against adversarial examples without sacrificing its accuracy under no attacks, 3) accuracy of differentially private classifiers, and 4) accuracy and/or efficiency of exact machine unlearning.

CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive Learning

Nov 22, 2022
Jinghuai Zhang, Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

Contrastive learning (CL) pre-trains general-purpose encoders using an unlabeled pre-training dataset, which consists of images (called single-modal CL) or image-text pairs (called multi-modal CL). CL is vulnerable to data poisoning based backdoor attacks (DPBAs), in which an attacker injects poisoned inputs into the pre-training dataset so that the encoder is backdoored. However, existing DPBAs achieve limited effectiveness. In this work, we propose new DPBAs against CL, called CorruptEncoder. Our experiments show that CorruptEncoder substantially outperforms existing DPBAs for both single-modal and multi-modal CL. CorruptEncoder is the first DPBA that achieves more than 90% attack success rates on single-modal CL with only a few (3) reference images and a small poisoning ratio (0.5%). Moreover, we also propose a defense, called localized cropping, to defend single-modal CL against DPBAs. Our results show that our defense can reduce the effectiveness of DPBAs, but it sacrifices the utility of the encoder, highlighting the need for new defenses.

Addressing Heterogeneity in Federated Learning via Distributional Transformation

Oct 26, 2022
Haolin Yuan, Bo Hui, Yuchen Yang, Philippe Burlina, Neil Zhenqiang Gong, Yinzhi Cao

Federated learning (FL) allows multiple clients to collaboratively train a deep learning model. One major challenge of FL arises when the data distribution is heterogeneous, i.e., differs from one client to another. Existing personalized FL algorithms are only applicable to narrow cases, e.g., one or two data classes per client, and therefore they do not satisfactorily address FL under varying levels of data heterogeneity. In this paper, we propose a novel framework, called DisTrans, to improve FL performance (i.e., model accuracy) via train-time and test-time distributional transformations along with a double-input-channel model structure. DisTrans works by optimizing distributional offsets and models for each FL client to shift their data distribution, and aggregates these offsets at the FL server to further improve performance in case of distributional heterogeneity. Our evaluation on multiple benchmark datasets shows that DisTrans outperforms state-of-the-art FL methods and data augmentation methods under various settings and different degrees of client distributional heterogeneity.

* In the Proceedings of European Conference on Computer Vision (ECCV), 2022

FLCert: Provably Secure Federated Learning against Poisoning Attacks

Oct 04, 2022
Xiaoyu Cao, Zaixi Zhang, Jinyuan Jia, Neil Zhenqiang Gong

Due to its distributed nature, federated learning is vulnerable to poisoning attacks, in which malicious clients poison the training process via manipulating their local training data and/or local model updates sent to the cloud server, such that the poisoned global model misclassifies many indiscriminate test inputs or attacker-chosen ones. Existing defenses mainly leverage Byzantine-robust federated learning methods or detect malicious clients. However, these defenses do not have provable security guarantees against poisoning attacks and may be vulnerable to more advanced attacks. In this work, we aim to bridge the gap by proposing FLCert, an ensemble federated learning framework, that is provably secure against poisoning attacks with a bounded number of malicious clients. Our key idea is to divide the clients into groups, learn a global model for each group of clients using any existing federated learning method, and take a majority vote among the global models to classify a test input. Specifically, we consider two methods to group the clients and propose two variants of FLCert correspondingly, i.e., FLCert-P that randomly samples clients in each group, and FLCert-D that divides clients into disjoint groups deterministically. Our extensive experiments on multiple datasets show that the label predicted by our FLCert for a test input is provably unaffected by a bounded number of malicious clients, no matter what poisoning attacks they use.

* To appear in Transactions on Information Forensics and Security. arXiv admin note: text overlap with arXiv:2102.01854
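
The grouping-and-voting idea in the abstract can be sketched as follows for the disjoint, deterministic variant. This is a simplified illustration, not the authors' implementation: the hash-based group assignment, the function names, and the tie-breaking rule are assumptions, and the tolerance bound below is a conservative argument derived only from the fact that, with disjoint groups, each malicious client can influence at most one group's model.

```python
import hashlib
from collections import Counter


def assign_group(client_id, num_groups):
    """Deterministically map a client ID to a group (hash-based; an assumption)."""
    digest = hashlib.sha256(client_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_groups


def split_clients(client_ids, num_groups):
    """Partition clients into disjoint groups; some groups may end up empty."""
    groups = [[] for _ in range(num_groups)]
    for cid in client_ids:
        groups[assign_group(cid, num_groups)].append(cid)
    return groups


def majority_vote(labels):
    """Return (winning label, its vote count, runner-up vote count)."""
    counts = Counter(labels)
    # Ties are broken toward the smaller label so the vote is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top_label, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0
    return top_label, top_count, runner_up_count


def ensemble_predict(group_models, x):
    """Each group's global model (any callable) votes on the label of x."""
    return majority_vote([model(x) for model in group_models])


def tolerated_malicious_clients(top_count, runner_up_count):
    """Conservative bound: m corrupted votes keep the winner while 2*m < gap."""
    gap = top_count - runner_up_count
    return max((gap - 1) // 2, 0)
```

For example, if five group models vote for label 1 and two vote for label 0, the gap is 3, so one malicious client cannot change the prediction (4 votes against 3 in the worst case), while two could (3 against 4).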

MultiGuard: Provably Robust Multi-label Classification against Adversarial Examples

Oct 03, 2022
Jinyuan Jia, Wenjie Qu, Neil Zhenqiang Gong

Multi-label classification, which predicts a set of labels for an input, has many applications. However, multiple recent studies showed that multi-label classification is vulnerable to adversarial examples. In particular, an attacker can manipulate the labels predicted by a multi-label classifier for an input via adding a carefully crafted, human-imperceptible perturbation to it. Existing provable defenses for multi-class classification achieve sub-optimal provable robustness guarantees when generalized to multi-label classification. In this work, we propose MultiGuard, the first provably robust defense against adversarial examples to multi-label classification. Our MultiGuard leverages randomized smoothing, which is the state-of-the-art technique to build provably robust classifiers. Specifically, given an arbitrary multi-label classifier, our MultiGuard builds a smoothed multi-label classifier via adding random noise to the input. We consider isotropic Gaussian noise in this work. Our major theoretical contribution is that we show a certain number of ground truth labels of an input are provably in the set of labels predicted by our MultiGuard when the $\ell_2$-norm of the adversarial perturbation added to the input is bounded. Moreover, we design an algorithm to compute our provable robustness guarantees. Empirically, we evaluate our MultiGuard on VOC 2007, MS-COCO, and NUS-WIDE benchmark datasets. Our code is available at: https://github.com/quwenjie/MultiGuard

* Accepted by NeurIPS 2022
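
A minimal Monte Carlo sketch of a smoothed multi-label prediction, under stated assumptions: the input is perturbed with isotropic Gaussian noise many times, the base classifier's predicted label sets are tallied, and the `k` most frequent labels are returned. The toy `base_classifier`, the parameters `k`, `sigma`, and `num_samples`, and the top-`k` selection rule are choices made for this example; the certification algorithm from the paper is not reproduced. The authors' actual code is at the URL given in the abstract.

```python
import random
from collections import Counter


def base_classifier(x):
    """Toy stand-in for an arbitrary multi-label classifier:
    label i is predicted whenever feature i is positive."""
    return {i for i, value in enumerate(x) if value > 0.0}


def smoothed_predict(x, k, sigma=0.5, num_samples=1000, seed=0):
    """Return the k labels predicted most often under Gaussian input noise."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_samples):
        # Isotropic Gaussian noise: independent N(0, sigma^2) per coordinate.
        noisy = [value + rng.gauss(0.0, sigma) for value in x]
        counts.update(base_classifier(noisy))
    # Rank by frequency; break ties toward the smaller label index.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:k]]


# Features 0 and 2 are strongly positive, so their labels dominate the tally.
predicted = smoothed_predict([3.0, -3.0, 2.5, -2.0], k=2)
```

The label frequencies here are empirical estimates; a certified guarantee would additionally need confidence bounds on them, which this sketch omits.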

Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning

Jul 25, 2022
Xinlei He, Hongbin Liu, Neil Zhenqiang Gong, Yang Zhang

Semi-supervised learning (SSL) leverages both labeled and unlabeled data to train machine learning (ML) models. State-of-the-art SSL methods can achieve comparable performance to supervised learning by leveraging far fewer labeled data. However, most existing works focus on improving the performance of SSL. In this work, we take a different angle by studying the training data privacy of SSL. Specifically, we propose the first data augmentation-based membership inference attacks against ML models trained by SSL. Given a data sample and black-box access to a model, the goal of a membership inference attack is to determine whether the data sample belongs to the training dataset of the model. Our evaluation shows that the proposed attack can consistently outperform existing membership inference attacks and achieves the best performance against the model trained by SSL. Moreover, we uncover that the reason for membership leakage in SSL is different from the commonly believed one in supervised learning, i.e., overfitting (the gap between training and testing accuracy). We observe that the SSL model is well generalized to the testing data (with almost 0 overfitting) but "memorizes" the training data by giving a more confident prediction regardless of its correctness. We also explore early stopping as a countermeasure to prevent membership inference attacks against SSL. The results show that early stopping can mitigate the membership inference attack, but at the cost of degrading the model's utility.

* Accepted to ECCV 2022
