Phung Lai

Active Membership Inference Attack under Local Differential Privacy in Federated Learning

Feb 24, 2023
Truc Nguyen, Phung Lai, Khang Tran, NhatHai Phan, My T. Thai

Federated learning (FL) was originally regarded as a framework for collaborative learning among clients, with data privacy protected through a coordinating server. In this paper, we propose a new active membership inference (AMI) attack carried out by a dishonest server in FL. In AMI attacks, the server crafts and embeds malicious parameters into global models to effectively infer whether a target data sample is included in a client's private training data. By exploiting the correlation among data features through a non-linear decision boundary, AMI attacks with a certified guarantee of success can achieve alarmingly high success rates even under rigorous local differential privacy (LDP) protection, thereby exposing clients' training data to significant privacy risk. Theoretical and experimental results on several benchmark datasets show that adding sufficient privacy-preserving noise to prevent our attack would significantly damage FL's model utility.

* To be published at AISTATS 2023 
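
For context on the threat model, the sketch below illustrates only the client-side LDP randomization that such an attack must overcome: each client clips its feature vector and adds noise calibrated to a local privacy budget before anything leaves the device. The Laplace mechanism, clipping bound, and epsilon here are illustrative assumptions, not the paper's attack or its certified analysis.

```python
# Minimal sketch of client-side LDP randomization in FL (illustrative only;
# the paper's attack and LDP mechanisms are not reproduced here).
import numpy as np

def ldp_perturb(features: np.ndarray, epsilon: float, bound: float = 1.0) -> np.ndarray:
    """Clip each feature to [-bound, bound], then add Laplace noise whose
    scale is calibrated to the per-feature range 2 * bound and budget epsilon."""
    clipped = np.clip(features, -bound, bound)
    scale = 2.0 * bound / epsilon  # sensitivity of one clipped feature
    return clipped + np.random.laplace(0.0, scale, size=clipped.shape)

client_features = np.array([0.3, -0.8, 0.5])
print(ldp_perturb(client_features, epsilon=1.0))
```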

XRand: Differentially Private Defense against Explanation-Guided Attacks

Dec 14, 2022
Truc Nguyen, Phung Lai, NhatHai Phan, My T. Thai

Recent developments in explainable artificial intelligence (XAI) have helped improve trust in Machine-Learning-as-a-Service (MLaaS) systems, in which an explanation is provided together with the model prediction in response to each query. However, XAI also opens a door for adversaries to gain insights into the black-box models in MLaaS, thereby making the models more vulnerable to several attacks. For example, feature-based explanations (e.g., SHAP) can expose the top important features that a black-box model focuses on. Such disclosure has been exploited to craft effective backdoor triggers against malware classifiers. To address this trade-off, we introduce a new concept of achieving local differential privacy (LDP) in the explanations, and from that we establish a defense, called XRand, against such attacks. We show that our mechanism restricts the information that the adversary can learn about the top important features while maintaining the faithfulness of the explanations.

* To be published at AAAI 2023 
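
As a rough illustration of the idea of randomizing explanation disclosure, the sketch below samples which k features an explanation reveals with probability increasing in their SHAP magnitude, in an exponential-mechanism style. The utility function, sensitivity assumption, and sampling rule are hypothetical and are not XRand's actual mechanism.

```python
# Hypothetical sketch of privacy-aware top-k feature disclosure (not XRand's
# actual mechanism): features with larger |SHAP| are more likely to be shown,
# but the true top-k is never revealed deterministically.
import numpy as np

def randomized_top_k(shap_scores: np.ndarray, k: int, epsilon: float) -> np.ndarray:
    utility = np.abs(shap_scores)
    weights = np.exp(epsilon * utility / (2.0 * k))  # assumes utility sensitivity 1
    probs = weights / weights.sum()
    return np.random.choice(len(shap_scores), size=k, replace=False, p=probs)

scores = np.array([0.40, -0.05, 0.22, 0.01, -0.31])
print(randomized_top_k(scores, k=2, epsilon=2.0))
```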

Heterogeneous Randomized Response for Differential Privacy in Graph Neural Networks

Nov 10, 2022
Khang Tran, Phung Lai, NhatHai Phan, Issa Khalil, Yao Ma, Abdallah Khreishah, My Thai, Xintao Wu

Graph neural networks (GNNs) are susceptible to privacy inference attacks (PIAs), given their ability to learn joint representations from features and edges among nodes in graph data. To prevent privacy leakage in GNNs, we propose a novel heterogeneous randomized response (HeteroRR) mechanism that protects nodes' features and edges against PIAs under differential privacy (DP) guarantees without an undue cost in data and model utility when training GNNs. Our idea is to balance the importance and sensitivity of nodes' features and edges when redistributing the privacy budgets, since some features and edges are more sensitive or important to model utility than others. As a result, we derive significantly better randomization probabilities and tighter error bounds at both the feature and edge levels than existing approaches, enabling us to maintain high data utility for training GNNs. An extensive theoretical and empirical analysis using benchmark datasets shows that HeteroRR significantly outperforms various baselines in terms of model utility under rigorous privacy protection for both nodes' features and edges. This enables an effective defense against PIAs in DP-preserving GNNs.

* Accepted in IEEE BigData 2022 (short paper) 
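
A minimal sketch of the heterogeneous-randomized-response idea is given below for binary values such as feature bits or adjacency entries: items deemed more important receive a larger share of the total budget and are therefore flipped less often. The proportional budget split and the importance scores are assumptions for illustration; the paper derives its own randomization probabilities and error bounds.

```python
# Illustrative heterogeneous randomized response over binary values.
# Budget allocation proportional to importance is an assumption, not the
# paper's derived allocation.
import numpy as np

def hetero_rr(bits: np.ndarray, importance: np.ndarray, total_eps: float) -> np.ndarray:
    eps = total_eps * importance / importance.sum()   # per-item privacy budget
    flip_prob = 1.0 / (1.0 + np.exp(eps))             # classic RR flip probability
    flips = np.random.rand(len(bits)) < flip_prob
    return np.where(flips, 1 - bits, bits)

node_feature_bits = np.array([1, 0, 1, 1, 0])
importance_scores = np.array([3.0, 1.0, 2.0, 0.5, 0.5])
print(hetero_rr(node_feature_bits, importance_scores, total_eps=4.0))
```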

User-Entity Differential Privacy in Learning Natural Language Models

Nov 09, 2022
Phung Lai, NhatHai Phan, Tong Sun, Rajiv Jain, Franck Dernoncourt, Jiuxiang Gu, Nikolaos Barmpalios

In this paper, we introduce a novel concept of user-entity differential privacy (UeDP) to provide formal privacy protection simultaneously to both sensitive entities in textual data and data owners in learning natural language models (NLMs). To preserve UeDP, we develop a novel algorithm, called UeDP-Alg, that optimizes the trade-off between privacy loss and model utility with a tight sensitivity bound derived from seamlessly combining user and sensitive-entity sampling processes. An extensive theoretical analysis and evaluation show that our UeDP-Alg outperforms baseline approaches in model utility under the same privacy budget on several NLM tasks, using benchmark datasets.

* Accepted at IEEE BigData 2022 
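
The sketch below shows only the familiar sample-clip-noise skeleton of one DP training step over per-user gradients; the paper's contribution, the tight sensitivity bound obtained from jointly sampling users and sensitive entities, is not captured here. The sampling rate, clipping norm, and noise multiplier are illustrative.

```python
# Hypothetical sample-clip-noise step over per-user gradients (illustrative;
# the joint user/entity sampling analysis of UeDP-Alg is not reproduced).
import numpy as np

def noisy_user_update(user_grads, q_user=0.1, clip=1.0, noise_mult=1.1, rng=None):
    rng = rng or np.random.default_rng()
    sampled = [g for g in user_grads if rng.random() < q_user]  # Poisson-sample users
    total = np.zeros_like(user_grads[0])
    for g in sampled:
        norm = np.linalg.norm(g)
        total += g * min(1.0, clip / (norm + 1e-12))            # per-user clipping
    total += rng.normal(0.0, noise_mult * clip, size=total.shape)  # Gaussian noise
    return total / max(len(sampled), 1)

grads = [np.random.randn(4) for _ in range(20)]
print(noisy_user_update(grads))
```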

Lifelong DP: Consistently Bounded Differential Privacy in Lifelong Machine Learning

Jul 26, 2022
Phung Lai, Han Hu, NhatHai Phan, Ruoming Jin, My T. Thai, An M. Chen

In this paper, we show that the process of continually learning new tasks and memorizing previous tasks introduces unknown privacy risks and makes it challenging to bound the privacy loss. Based upon this, we introduce a formal definition of Lifelong DP, in which the participation of any data tuple in the training set of any task is protected under a consistently bounded DP guarantee, given a growing stream of tasks. Consistently bounded DP means having a single fixed value of the DP privacy budget, regardless of the number of tasks. To preserve Lifelong DP, we propose a scalable and heterogeneous algorithm, called L2DP-ML, with streaming batch training to efficiently train and continually release new versions of an L2M model, given heterogeneity in data sizes and in the training order of tasks, without affecting the DP protection of the private training set. An end-to-end theoretical analysis and thorough evaluations show that our mechanism is significantly better than baseline approaches in preserving Lifelong DP. The implementation of L2DP-ML is available at: https://github.com/haiphanNJIT/PrivateDeepLearning.
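
To make the "consistently bounded" requirement concrete, the toy accounting below contrasts naive sequential composition, where the reported budget grows with every new task, with the Lifelong DP goal of a single fixed budget. The numbers are illustrative; this is not the paper's privacy analysis.

```python
# Toy budget accounting: naive per-task composition vs. the Lifelong DP goal
# of one fixed budget regardless of the number of tasks (illustrative only).

def naive_composed_epsilon(eps_per_task: float, num_tasks: int) -> float:
    return eps_per_task * num_tasks   # basic sequential composition: budget grows

def lifelong_dp_epsilon(total_eps: float, num_tasks: int) -> float:
    return total_eps                  # stays fixed as tasks keep arriving

for t in (1, 5, 20):
    print(t, naive_composed_epsilon(1.0, t), lifelong_dp_epsilon(1.0, t))
```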

Model Transferring Attacks to Backdoor HyperNetwork in Personalized Federated Learning

Jan 19, 2022
Phung Lai, NhatHai Phan, Abdallah Khreishah, Issa Khalil, Xintao Wu

This paper explores previously unknown backdoor risks in HyperNet-based personalized federated learning (HyperNetFL) through poisoning attacks. Based upon that, we propose a novel model transferring attack, called HNTROJ, the first of its kind, which transfers a local backdoor-infected model to all legitimate personalized local models generated by the HyperNetFL model, through consistent and effective malicious local gradients computed across all compromised clients throughout the training process. As a result, HNTROJ reduces the number of compromised clients needed to successfully launch the attack, without any observable signs of sudden shifts or degradation in model utility on legitimate data samples, making our attack stealthy. To defend against HNTROJ, we adapted several backdoor-resistant FL training algorithms into HyperNetFL. An extensive experiment carried out on several benchmark datasets shows that HNTROJ significantly outperforms data poisoning and model replacement attacks and bypasses robust training algorithms.
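
As a heavily simplified illustration of the attack's mechanics, the sketch below shows a compromised client reporting an update that pulls its hypernetwork-generated parameters toward a backdoor-infected target model rather than following the honest data gradient. All names and the update rule are hypothetical; the paper's actual gradient construction and stealthiness considerations are more involved.

```python
# Hypothetical malicious update from a compromised client: move the local
# model toward a trojaned target instead of the honest gradient direction.
import numpy as np

def malicious_update(local_params: np.ndarray, trojan_params: np.ndarray,
                     scale: float = 1.0) -> np.ndarray:
    return scale * (trojan_params - local_params)

local = np.random.randn(8)                  # parameters produced by the hypernetwork (toy)
trojan = local + 0.1 * np.random.randn(8)   # backdoor-infected target model (toy)
print(malicious_update(local, trojan))
```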

Continual Learning with Differential Privacy

Oct 11, 2021
Pradnya Desai, Phung Lai, NhatHai Phan, My T. Thai

In this paper, we focus on preserving differential privacy (DP) in continual learning (CL), in which we train ML models to learn a sequence of new tasks while memorizing previous tasks. We first introduce a notion of continual adjacent databases to bound the sensitivity of any data record participating in the training process of CL. Based upon that, we develop a new DP-preserving algorithm for CL with a data sampling strategy to quantify the privacy risk of training data in the well-known Averaged Gradient Episodic Memory (A-GEM) approach by applying a moments accountant. Our algorithm provides formal privacy guarantees for data records across tasks in CL. Preliminary theoretical analysis and evaluations show that our mechanism tightens the privacy loss while maintaining promising model utility.

* The paper will appear at ICONIP 2021
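
For intuition, the sketch below combines an A-GEM-style gradient projection against an episodic-memory gradient with a DP-SGD-style clip-and-noise step; where exactly the paper applies sampling, clipping, and the moments accountant may differ, so this is only an assumed arrangement.

```python
# Hypothetical DP + A-GEM step: project the task gradient if it conflicts
# with the episodic-memory gradient, then clip and add Gaussian noise.
import numpy as np

def dp_agem_step(grad_task, grad_memory, clip=1.0, sigma=1.0, rng=None):
    rng = rng or np.random.default_rng()
    dot = grad_task @ grad_memory
    if dot < 0:  # conflicting directions: remove the conflicting component
        grad_task = grad_task - (dot / (grad_memory @ grad_memory)) * grad_memory
    norm = np.linalg.norm(grad_task)
    grad_task = grad_task * min(1.0, clip / (norm + 1e-12))   # gradient clipping
    return grad_task + rng.normal(0.0, sigma * clip, size=grad_task.shape)

print(dp_agem_step(np.random.randn(5), np.random.randn(5)))
```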

Ontology-based Interpretable Machine Learning for Textual Data

Apr 01, 2020
Phung Lai, NhatHai Phan, Han Hu, Anuja Badeti, David Newman, Dejing Dou

In this paper, we introduce a novel interpreting framework that learns an interpretable model based on an ontology-based sampling technique to explain prediction models in a model-agnostic way. Different from existing approaches, our algorithm considers contextual correlation among words, described in domain-knowledge ontologies, to generate semantic explanations. To narrow down the search space for explanations, which is a major problem with long and complicated text data, we design a learnable anchor algorithm to better extract explanations locally. A set of rules is further introduced for combining learned interpretable representations with anchors to generate comprehensible semantic explanations. An extensive experiment conducted on two real-world datasets shows that our approach generates more precise and insightful explanations compared with baseline approaches.

* Accepted by IJCNN 2020 
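
The sketch below illustrates only the flavor of ontology-guided sampling for local explanations: perturb a sentence by swapping words for related terms drawn from a toy domain ontology, so that perturbed samples stay semantically meaningful. The toy ontology, swap probability, and sampling rule are assumptions, not the paper's algorithm or its learnable anchors.

```python
# Toy ontology-guided perturbation for local explanation sampling
# (illustrative; not the paper's ontology-based sampling or anchor learning).
import random

TOY_ONTOLOGY = {
    "loan": ["mortgage", "credit"],
    "doctor": ["physician", "clinician"],
}

def ontology_samples(tokens, num_samples=5, swap_prob=0.3, rng=random):
    samples = []
    for _ in range(num_samples):
        perturbed = [rng.choice(TOY_ONTOLOGY[t])
                     if t in TOY_ONTOLOGY and rng.random() < swap_prob else t
                     for t in tokens]
        samples.append(perturbed)
    return samples

print(ontology_samples("the doctor approved the loan".split()))
```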