Dimitar I. Dimitrov


Hiding in Plain Sight: Disguising Data Stealing Attacks in Federated Learning

Jun 16, 2023
Kostadin Garov, Dimitar I. Dimitrov, Nikola Jovanović, Martin Vechev


Malicious server (MS) attacks have enabled the scaling of data stealing in federated learning to large batch sizes and secure aggregation, settings previously considered private. However, many concerns have been raised regarding the client-side detectability of MS attacks, questioning their practicality once they are publicly known. In this work, for the first time, we thoroughly study the problem of client-side detectability. We demonstrate that most prior MS attacks, which fundamentally rely on one of two key principles, are detectable by principled client-side checks. Further, we formulate desiderata for practical MS attacks and propose SEER, a novel attack framework that satisfies all desiderata, while stealing user data from gradients of realistic networks, even for large batch sizes (up to 512 in our experiments) and under secure aggregation. The key insight of SEER is the use of a secret decoder, which is jointly trained with the shared model. Our work represents a promising first step towards a more principled treatment of MS attacks, paving the way for realistic data stealing that can compromise user privacy in real-world deployments.
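
As a rough illustration of the secret-decoder idea, the sketch below jointly trains a tiny shared model and a server-side decoder that maps the batch-aggregated gradient back to one sample of the batch. The architectures, synthetic data, and plain MSE objective are illustrative assumptions only and do not reflect SEER's actual disaggregation objective or training recipe.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
shared = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
n_params = sum(p.numel() for p in shared.parameters())
decoder = nn.Sequential(nn.Linear(n_params, 256), nn.ReLU(), nn.Linear(256, 64))

opt = torch.optim.Adam(list(shared.parameters()) + list(decoder.parameters()), lr=1e-3)
task_loss = nn.CrossEntropyLoss()

for step in range(200):
    x = torch.rand(16, 64)                    # simulated client batch (flattened inputs)
    y = torch.randint(0, 10, (16,))
    target = x[0]                             # sample the attacker wants to single out

    # What the server observes: one gradient aggregated over the whole batch.
    loss = task_loss(shared(x), y)
    grads = torch.autograd.grad(loss, shared.parameters(), create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])

    # Secret decoder, kept on the server, reconstructs the target from that gradient.
    recon = decoder(flat_grad)
    attack_loss = ((recon - target) ** 2).mean()

    # Joint objective: keep the shared model useful while making its gradients decodable.
    opt.zero_grad()
    (loss + attack_loss).backward()
    opt.step()
```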


FARE: Provably Fair Representation Learning

Oct 13, 2022
Nikola Jovanović, Mislav Balunović, Dimitar I. Dimitrov, Martin Vechev


Fair representation learning (FRL) is a popular class of methods aiming to produce fair classifiers via data preprocessing. However, recent work has shown that prior methods achieve worse accuracy-fairness tradeoffs than originally suggested by their results. This dictates the need for FRL methods that provide provable upper bounds on the unfairness of any downstream classifier, a challenge yet unsolved. In this work we address this challenge and propose Fairness with Restricted Encoders (FARE), the first FRL method with provable fairness guarantees. Our key insight is that restricting the representation space of the encoder enables us to derive suitable fairness guarantees, while allowing empirical accuracy-fairness tradeoffs comparable to prior work. FARE instantiates this idea with a tree-based encoder, a choice motivated by the inherent advantages of decision trees when applied in our setting. Crucially, we develop and apply a practical statistical procedure that computes a high-confidence upper bound on the unfairness of any downstream classifier. In our experimental evaluation on several datasets and settings, we demonstrate that FARE produces tight upper bounds, often comparable with the empirical results of prior methods, which establishes the practical value of our approach.
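
For intuition, here is a minimal sketch of the core bound: if the encoder maps every input to one of finitely many cells (here, leaves of an off-the-shelf decision tree), then any downstream classifier is a function of the cell, so its demographic parity gap is at most the total variation distance between the cell distributions of the two sensitive groups. The fairness-aware tree construction and the finite-sample confidence correction that make this a provable high-confidence bound in FARE are omitted; all data below is synthetic.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))
s = rng.integers(0, 2, size=5000)             # sensitive attribute
y = (X[:, 0] + 0.3 * s + rng.normal(scale=0.5, size=5000) > 0).astype(int)

# Restricted encoder: each sample is mapped to one of a small number of tree leaves.
tree = DecisionTreeClassifier(max_leaf_nodes=8, random_state=0).fit(X, y)
z = tree.apply(X)                             # leaf id = representation

# Any classifier built on z has demographic parity gap at most TV(P(z|s=0), P(z|s=1)).
cells = np.unique(z)
p0 = np.array([np.mean(z[s == 0] == c) for c in cells])
p1 = np.array([np.mean(z[s == 1] == c) for c in cells])
tv_bound = 0.5 * np.abs(p0 - p1).sum()
print(f"empirical unfairness upper bound (TV distance): {tv_bound:.3f}")
```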


Data Leakage in Tabular Federated Learning

Oct 04, 2022
Mark Vero, Mislav Balunović, Dimitar I. Dimitrov, Martin Vechev


While federated learning (FL) promises to preserve privacy in distributed training of deep learning models, recent work in the image and NLP domains showed that training updates leak private data of participating clients. At the same time, most high-stakes applications of FL (e.g., legal and financial) use tabular data. Compared to the NLP and image domains, reconstruction of tabular data poses several unique challenges: (i) categorical features introduce a significantly more difficult mixed discrete-continuous optimization problem, (ii) the mix of categorical and continuous features causes high variance in the final reconstructions, and (iii) structured data makes it difficult for the adversary to judge reconstruction quality. In this work, we tackle these challenges and propose the first comprehensive reconstruction attack on tabular data, called TabLeak. TabLeak is based on three key ingredients: (i) a softmax structural prior, implicitly converting the mixed discrete-continuous optimization problem into an easier fully continuous one, (ii) a way to reduce the variance of our reconstructions through a pooled ensembling scheme exploiting the structure of tabular data, and (iii) an entropy measure which can successfully assess reconstruction quality. Our experimental evaluation demonstrates the effectiveness of TabLeak, achieving state-of-the-art results on four popular tabular datasets. For instance, on the Adult dataset, we improve attack accuracy by 10% over the baseline at the practically relevant batch size of 32 and further obtain non-trivial reconstructions for batch sizes as large as 128. Our findings are important as they show that FL on tabular data, where privacy risks are often high, is highly vulnerable to data leakage.
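
A minimal sketch of the softmax structural prior: categorical features are optimized as unconstrained logits pushed through a softmax before entering the network, turning the mixed discrete-continuous reconstruction into a fully continuous gradient-matching problem. The toy network, known label, and cosine gradient loss are assumptions for illustration; TabLeak's pooled ensembling and entropy-based quality assessment are not shown.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()

# Ground-truth client row: a 3-way categorical feature (one-hot) + 2 continuous features.
x_true = torch.cat([torch.tensor([0., 1., 0.]), torch.tensor([0.2, -1.3])]).unsqueeze(0)
y_true = torch.tensor([1])                    # label assumed known here for simplicity
true_grads = torch.autograd.grad(loss_fn(net(x_true), y_true), net.parameters())

# Attacker variables: unconstrained logits for the categorical part, raw continuous part.
cat_logits = torch.randn(1, 3, requires_grad=True)
cont = torch.randn(1, 2, requires_grad=True)
opt = torch.optim.Adam([cat_logits, cont], lr=0.1)

for _ in range(300):
    x_guess = torch.cat([torch.softmax(cat_logits, dim=1), cont], dim=1)
    grads = torch.autograd.grad(loss_fn(net(x_guess), y_true), net.parameters(),
                                create_graph=True)
    rec_loss = sum(1 - torch.cosine_similarity(g.flatten(), t.flatten(), dim=0)
                   for g, t in zip(grads, true_grads))
    opt.zero_grad()
    rec_loss.backward()
    opt.step()

print(torch.softmax(cat_logits, dim=1).detach(), cont.detach())   # recovered row
```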


Data Leakage in Federated Averaging

Jun 27, 2022
Dimitar I. Dimitrov, Mislav Balunović, Nikola Konstantinov, Martin Vechev


Recent attacks have shown that user data can be recovered from FedSGD updates, thus breaking privacy. However, these attacks are of limited practical relevance as federated learning typically uses the FedAvg algorithm. Compared to FedSGD, recovering data from FedAvg updates is much harder as: (i) the updates are computed at unobserved intermediate network weights, (ii) a large number of batches are used, and (iii) labels and network weights vary simultaneously across client steps. In this work, we propose a new optimization-based attack which successfully attacks FedAvg by addressing the above challenges. First, we solve the optimization problem using automatic differentiation: we simulate the client's update from the recovered inputs and labels, thereby generating the unobserved intermediate parameters, and force this simulated update to match the received client update. Second, we address the large number of batches by relating images from different epochs with a permutation invariant prior. Third, we recover the labels by estimating the parameters of existing FedSGD attacks at every FedAvg step. On the popular FEMNIST dataset, we demonstrate that on average we successfully recover >45% of the client's images from realistic FedAvg updates computed on 10 local epochs of 10 batches each with 5 images, compared to only <10% using the baseline. Our findings show that many real-world federated learning implementations based on FedAvg are vulnerable.
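
The sketch below illustrates only the first ingredient: differentiating through a simulation of the client's local SGD steps so that the update induced by the reconstruction candidate matches the observed FedAvg update. The tiny linear model, single batch, and known labels and learning rate are simplifying assumptions; the permutation-invariant prior and the label recovery are not shown.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
w0 = (torch.randn(4, 8) * 0.1, torch.zeros(4))       # weights the server sent to the client
x_true = torch.rand(5, 8)
y_true = torch.randint(0, 4, (5,))
lr, steps = 0.1, 2

def local_update(x, y):
    # Differentiable simulation of the client's local SGD steps from the sent weights.
    w = w0[0].clone().requires_grad_(True)
    b = w0[1].clone().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(x @ w.t() + b, y)
        gw, gb = torch.autograd.grad(loss, (w, b), create_graph=True)
        w, b = w - lr * gw, b - lr * gb
    return w - w0[0], b - w0[1]                       # the update the server observes

true_dw, true_db = [d.detach() for d in local_update(x_true, y_true)]

x_guess = torch.rand(5, 8, requires_grad=True)
opt = torch.optim.Adam([x_guess], lr=0.05)
for _ in range(500):
    dw, db = local_update(x_guess, y_true)
    rec_loss = ((dw - true_dw) ** 2).sum() + ((db - true_db) ** 2).sum()
    opt.zero_grad()
    rec_loss.backward()
    opt.step()
```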


LAMP: Extracting Text from Gradients with Language Model Priors

Feb 17, 2022
Dimitar I. Dimitrov, Mislav Balunović, Nikola Jovanović, Martin Vechev


Recent work shows that sensitive user data can be reconstructed from gradient updates, breaking the key privacy promise of federated learning. While success was demonstrated primarily on image data, these methods do not directly transfer to other domains such as text. In this work, we propose LAMP, a novel attack tailored to textual data, that successfully reconstructs original text from gradients. Our key insight is to model the prior probability of the text with an auxiliary language model, utilizing it to guide the search towards more natural text. Concretely, LAMP introduces a discrete text transformation procedure that minimizes the reconstruction loss while maximizing the prior probability of the text, as provided by the auxiliary language model. The procedure is alternated with a continuous optimization of the reconstruction loss, which also regularizes the length of the reconstructed embeddings. Our experiments demonstrate that LAMP reconstructs the original text significantly more precisely than prior work: we recover 5x more bigrams and $23\%$ longer subsequences on average. Moreover, we are the first to recover inputs from batch sizes larger than 1 for textual models. These findings indicate that gradient updates of models operating on textual data leak more information than previously thought.
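
The skeleton below shows only the shape of the discrete step: candidate token sequences (here, simple pairwise swaps) are scored by a combination of gradient reconstruction loss and language-model negative log-likelihood, and the best candidate is kept. A tiny random-weight model stands in for the auxiliary language model and the embedding-averaging "classifier" is purely illustrative; LAMP's actual transformations, losses, and alternation with continuous optimization are more involved.

```python
import itertools
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab, dim = 50, 16
emb = nn.Embedding(vocab, dim)
lm_head = nn.Linear(dim, vocab)        # toy stand-in for a pretrained language model
net = nn.Linear(dim, 2)                # toy stand-in for the attacked text classifier
label = torch.tensor([1])

def lm_nll(tokens):
    # Negative log-likelihood of the sequence under the (toy) language-model prior.
    return F.cross_entropy(lm_head(emb(tokens[:-1])), tokens[1:])

def rec_loss(tokens, target_grad):
    # Distance between the gradient induced by a candidate and the observed gradient.
    g = torch.autograd.grad(
        F.cross_entropy(net(emb(tokens).mean(0, keepdim=True)), label), net.parameters())
    return sum(((a - b) ** 2).sum() for a, b in zip(g, target_grad))

def discrete_step(tokens, target_grad, alpha=0.1):
    # Try all pairwise token swaps and keep the candidate with the best combined score.
    best, best_score = tokens, rec_loss(tokens, target_grad) + alpha * lm_nll(tokens)
    for i, j in itertools.combinations(range(len(tokens)), 2):
        cand = tokens.clone()
        cand[i], cand[j] = tokens[j], tokens[i]
        score = rec_loss(cand, target_grad) + alpha * lm_nll(cand)
        if score < best_score:
            best, best_score = cand, score
    return best

true_tokens = torch.randint(0, vocab, (8,))
observed = [g.detach() for g in torch.autograd.grad(
    F.cross_entropy(net(emb(true_tokens).mean(0, keepdim=True)), label), net.parameters())]
guess = discrete_step(torch.randint(0, vocab, (8,)), observed)
```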


Bayesian Framework for Gradient Leakage

Nov 08, 2021
Mislav Balunović, Dimitar I. Dimitrov, Robin Staab, Martin Vechev


Federated learning is an established method for training machine learning models without sharing training data. However, recent work has shown that it cannot guarantee data privacy as shared gradients can still leak sensitive information. To formalize the problem of gradient leakage, we propose a theoretical framework that enables, for the first time, analysis of the Bayes optimal adversary phrased as an optimization problem. We demonstrate that existing leakage attacks can be seen as approximations of this optimal adversary with different assumptions on the probability distributions of the input data and gradients. Our experiments confirm the effectiveness of the Bayes optimal adversary when it has knowledge of the underlying distribution. Further, our experimental evaluation shows that several existing heuristic defenses are not effective against stronger attacks, especially early in the training process. Thus, our findings indicate that the construction of more effective defenses and their evaluation remains an open problem.
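
Schematically (in notation chosen here, not necessarily the paper's), the Bayes optimal adversary reconstructs the input as a maximum a posteriori estimate given the observed gradient $g$:

$$ x^{*} \;=\; \arg\max_{x}\; p(x \mid g) \;=\; \arg\max_{x}\; p(g \mid x)\, p(x), $$

so that common gradient-matching attacks, which minimize an objective of the form $\|\nabla_\theta \mathcal{L}(x, y; \theta) - g\|^2 + \lambda\, \mathrm{reg}(x)$, can be read as approximating the negative log-likelihood and negative log-prior terms under particular distributional assumptions.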


Shared Certificates for Neural Network Verification

Sep 14, 2021
Christian Sprecher, Marc Fischer, Dimitar I. Dimitrov, Gagandeep Singh, Martin Vechev


Existing neural network verifiers compute a proof that each input is handled correctly under a given perturbation by propagating a convex set of reachable values at each layer. This process is repeated independently for each input (e.g., image) and perturbation (e.g., rotation), leading to an expensive overall proof effort when handling an entire dataset. In this work, we introduce a new method for reducing this verification cost based on the key insight that convex sets obtained at intermediate layers can overlap across different inputs and perturbations. Leveraging this insight, we introduce the general concept of shared certificates, enabling proof effort reuse across multiple inputs and driving down overall verification costs. We validate our insight via an extensive experimental evaluation and demonstrate the effectiveness of shared certificates on a range of datasets and attack specifications, including geometric, patch, and $\ell_\infty$ input perturbations.
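
A toy sketch of the reuse idea, using interval (box) abstractions and simple containment as the only reuse criterion; the actual verifier, relaxations, and template construction in the paper are considerably more general.

```python
import numpy as np

class Box:
    def __init__(self, lo, hi):
        self.lo, self.hi = np.asarray(lo, float), np.asarray(hi, float)
    def contains(self, other):
        return bool(np.all(self.lo <= other.lo) and np.all(other.hi <= self.hi))

# Template: an intermediate-layer box already proven to reach only the correct class.
verified_template = Box(lo=[-1.0, -0.5], hi=[1.0, 0.8])

def verify(intermediate_box, expensive_full_proof):
    # If the new input/perturbation lands inside an already-verified region at this
    # layer, the remaining layers need not be re-analysed.
    if verified_template.contains(intermediate_box):
        return True                      # proof reused, no further propagation
    return expensive_full_proof(intermediate_box)

new_box = Box(lo=[-0.2, 0.0], hi=[0.4, 0.3])
print(verify(new_box, lambda b: False))  # True: the shared certificate applies
```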


Proof Transfer for Neural Network Verification

Sep 01, 2021
Christian Sprecher, Marc Fischer, Dimitar I. Dimitrov, Gagandeep Singh, Martin Vechev


We introduce the novel concept of proof transfer for neural network verification. We show that by generating proof templates that capture and generalize existing proofs, we can speed up subsequent proofs. In particular, we create these templates from previous proofs on the same neural network and consider two cases: (i) where the proofs are created online when verifying other properties and (ii) where the templates are created offline using a dataset. We base our methods on three key hypotheses about neural network robustness proofs. Our evaluation shows the potential of proof transfer for benefiting robustness verification of neural networks against adversarial patches, geometric, and $\ell_{\infty}$-perturbations.
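
To illustrate the offline case, the sketch below joins intermediate boxes recorded while verifying a few dataset inputs into a single template and then discharges new proof obligations by containment. Boxes, the element-wise join, and the containment check are assumptions for illustration; the paper's template selection and the verification of the template itself are not shown.

```python
import numpy as np

def join(boxes):
    # Smallest box covering all intermediate boxes observed on the dataset.
    los = np.min([lo for lo, _ in boxes], axis=0)
    his = np.max([hi for _, hi in boxes], axis=0)
    return los, his

# Intermediate boxes recorded while verifying a handful of dataset inputs offline.
observed = [(np.array([-0.3, 0.1]), np.array([0.5, 0.9])),
            (np.array([-0.1, 0.0]), np.array([0.6, 0.7]))]
template_lo, template_hi = join(observed)

def proof_transfers(lo, hi):
    # Online check: a new proof obligation is discharged if it falls inside the
    # (separately verified) template.
    return bool(np.all(template_lo <= lo) and np.all(hi <= template_hi))

print(proof_transfers(np.array([-0.2, 0.2]), np.array([0.4, 0.8])))  # True
```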


Scalable Inference of Symbolic Adversarial Examples

Jul 26, 2020
Dimitar I. Dimitrov, Gagandeep Singh, Timon Gehr, Martin Vechev


We present a novel method for generating symbolic adversarial examples: input regions guaranteed to only contain adversarial examples for the given neural network. These regions summarize trillions of concrete adversarial examples and can thus be used to generate real-world adversarial examples. We theoretically show that computing optimal symbolic adversarial examples is computationally expensive, and present a method for approximating them in a scalable manner. Our method first selectively uses adversarial attacks to generate a candidate region and then prunes this region with hyperplanes that fit points obtained via specialized sampling. It iterates until arriving at a symbolic adversarial example for which it can prove, via state-of-the-art convex relaxation techniques, that the region only contains adversarial examples. Our experimental results demonstrate that our method is practically effective: it only needs a few thousand attacks to infer symbolic summaries guaranteed to contain $\approx 10^{258}$ adversarial examples.
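
A toy 2-D sketch of the refine loop: a candidate box around attack-found points is iteratively cut with hyperplanes fit to sampled violating points until sampling finds no non-adversarial points. The toy "classifier", random sampling, and centroid-based hyperplane fit are stand-ins; the paper uses real attacks, specialized sampling, and a convex-relaxation certificate for the final region.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x @ np.array([1.0, -0.7]) + 0.1 * np.sin(3.0 * x[:, 1]) - 0.2
is_adv = lambda x: f(x) > 0                      # "misclassified" region of a toy model

# Candidate region: bounding box of points found by a cheap random "attack".
pts = rng.uniform(-1, 1, size=(500, 2))
adv = pts[is_adv(pts)]
lo, hi = adv.min(axis=0), adv.max(axis=0)
cuts = []                                        # extra half-space constraints a.x + b <= 0

def sample_region(n):
    x = rng.uniform(lo, hi, size=(n, 2))
    for a, b in cuts:
        x = x[x @ a + b <= 0]
    return x

for _ in range(20):
    x = sample_region(2000)
    bad = ~is_adv(x)
    if not bad.any() or bad.all():
        break                                    # no violations sampled (or degenerate case)
    good_c, bad_c = x[~bad].mean(axis=0), x[bad].mean(axis=0)
    # Hyperplane between the two centroids; keep only the adversarial side.
    a = bad_c - good_c
    b = -a @ ((good_c + bad_c) / 2.0)
    cuts.append((a, b))
```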
