Kamalika Chaudhuri

Privacy Amplification by Subsampling in Time Domain

Jan 13, 2022
Tatsuki Koga, Casey Meehan, Kamalika Chaudhuri

Aggregate time-series data like traffic flow and site occupancy repeatedly sample statistics from a population across time. Such data can be profoundly useful for understanding trends within a given population, but also pose a significant privacy risk, potentially revealing, e.g., who spends time where. Producing a private version of a time series satisfying the standard definition of Differential Privacy (DP) is challenging due to the large influence a single participant can have on the sequence: if an individual can contribute to each time step, the amount of additive noise needed to satisfy privacy increases linearly with the number of time steps sampled. As such, if a signal spans a long duration or is oversampled, an excessive amount of noise must be added, drowning out underlying trends. However, in many applications an individual realistically cannot participate at every time step. When this is the case, we observe that the influence of a single participant (sensitivity) can be reduced by subsampling and/or filtering in time, while still meeting privacy requirements. Using a novel analysis, we show this significant reduction in sensitivity and propose a corresponding class of privacy mechanisms. We demonstrate the utility benefits of these techniques empirically with real-world and synthetic time-series data.
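
The linear growth of noise with sequence length described above can be illustrated with a minimal sketch of the baseline Laplace mechanism. This is not the paper's mechanism: the function names, the unit per-step contribution, and the naive stride-based subsampling are all assumptions made for illustration, and the sketch does not model the participation constraint that the paper's analysis exploits.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale it.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def l1_sensitivity(num_steps, per_step_contribution=1.0):
    # Assumption: one person changes each released count by at most
    # `per_step_contribution` and may be present at every released step.
    return num_steps * per_step_contribution


def release_counts(counts, epsilon, rng):
    # Baseline epsilon-DP Laplace mechanism over the whole sequence:
    # the per-step noise scale grows linearly with len(counts).
    scale = l1_sensitivity(len(counts)) / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def subsample(counts, stride):
    # Keep every `stride`-th time step (hypothetical naive subsampling).
    return counts[::stride]


rng = random.Random(0)
hourly = [120, 131, 128, 140, 152, 149, 160, 158]
full = release_counts(hourly, epsilon=1.0, rng=rng)
coarse = release_counts(subsample(hourly, 4), epsilon=1.0, rng=rng)
```

In this toy setting, releasing two of the eight steps cuts the worst-case sensitivity, and hence the Laplace scale, by a factor of four, at the cost of temporal resolution.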

Privacy Amplification via Shuffling for Linear Contextual Bandits

Dec 11, 2021
Evrard Garcelon, Kamalika Chaudhuri, Vianney Perchet, Matteo Pirotta

Contextual bandit algorithms are widely used in domains where it is desirable to provide a personalized service by leveraging contextual information that may be sensitive and needs to be protected. Inspired by this scenario, we study the contextual linear bandit problem with differential privacy (DP) constraints. While the literature has focused on either centralized (joint DP, JDP) or local (local DP, LDP) privacy, we consider the shuffle model of privacy and show that it is possible to achieve a privacy/utility trade-off between JDP and LDP. By leveraging shuffling from privacy and batching from bandits, we present an algorithm with regret bound $\widetilde{\mathcal{O}}(T^{2/3}/\varepsilon^{1/3})$, while guaranteeing both central (joint) and local privacy. Our result shows that it is possible to obtain a trade-off between JDP and LDP by leveraging the shuffle model while preserving local privacy.

Behavior of k-NN as an Instance-Based Explanation Method

Sep 14, 2021
Chhavi Yadav, Kamalika Chaudhuri

Adoption of DL models in critical areas has led to an escalating demand for sound explanation methods. Instance-based explanation methods are a popular type that return selected instances from the training set to explain the predictions for a test sample. One way to connect these explanations with prediction is to ask the following counterfactual question: how do the loss and prediction for a test sample change when explanations are removed from the training set? Our paper answers this question for k-NNs, which are natural contenders for an instance-based explanation method. We first demonstrate empirically that the representation space induced by the last layer of a neural network is the best to perform k-NN in. Using this layer, we conduct our experiments and compare them to influence functions (IFs) (Koh and Liang, 2017), which try to answer a similar question. Our evaluations do indicate change in loss and predictions when explanations are removed, but we do not find a trend between $k$ and loss or prediction change. We find significant stability in the predictions and loss of MNIST vs. CIFAR-10. Surprisingly, we do not observe much difference in the behavior of k-NNs vs. IFs on this question. We attribute this to training set subsampling for IFs.
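
The counterfactual question above can be sketched in a few lines. This is a toy illustration, not the paper's experimental code: it assumes the embeddings (for example, last-layer activations) are already available as plain lists of floats, uses Euclidean distance and majority voting, and all function names are hypothetical.

```python
import math
from collections import Counter


def knn_indices(train_x, query, k):
    # Indices of the k training points closest to `query` (Euclidean).
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return order[:k]


def knn_predict(train_x, train_y, query, k):
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(train_y[i] for i in knn_indices(train_x, query, k))
    return votes.most_common(1)[0][0]


def prediction_before_and_after_removal(train_x, train_y, query, k):
    # The k nearest neighbours act as the "explanations"; remove them
    # from the training set and predict again on what remains.
    before = knn_predict(train_x, train_y, query, k)
    removed = set(knn_indices(train_x, query, k))
    kept = [i for i in range(len(train_x)) if i not in removed]
    after = knn_predict([train_x[i] for i in kept],
                        [train_y[i] for i in kept], query, k)
    return before, after


train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["a", "a", "a", "b", "b", "b"]
before, after = prediction_before_and_after_removal(
    train_x, train_y, [0.2, 0.2], k=3)
```

In this toy set the three explanations are the only points of class "a", so removing them flips the prediction; on real data the effect depends on how densely each class populates the representation space.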

A Shuffling Framework for Local Differential Privacy

Jun 11, 2021
Casey Meehan, Amrita Roy Chowdhury, Kamalika Chaudhuri, Somesh Jha

Local differential privacy (LDP) deployments are vulnerable to inference attacks, as an adversary can use the order of the data to link the noisy responses to users' identities and, subsequently, to auxiliary information. An alternative model, shuffle DP, prevents this by shuffling the noisy responses uniformly at random. However, this limits the data learnability: only symmetric functions (input order agnostic) can be learned. In this paper, we strike a balance and propose a generalized shuffling framework that interpolates between the two deployment models. We show that systematic shuffling of the noisy responses can thwart specific inference attacks while retaining some meaningful data learnability. To this end, we propose a novel privacy guarantee, d-sigma privacy, that captures the privacy of the order of a data sequence. d-sigma privacy allows tuning the granularity at which the ordinal information is maintained, which formalizes the degree of resistance to inference attacks, trading it off against data learnability. Additionally, we propose a novel shuffling mechanism that can achieve d-sigma privacy and demonstrate the practicality of our mechanism via evaluation on real-world datasets.

Understanding Instance-based Interpretability of Variational Auto-Encoders

May 29, 2021
Zhifeng Kong, Kamalika Chaudhuri

Instance-based interpretation methods have been widely studied for supervised learning methods as they help explain how black-box neural networks predict. However, instance-based interpretations remain ill-understood in the context of unsupervised learning. In this paper, we investigate influence functions [20], a popular instance-based interpretation method, for a class of deep generative models called variational auto-encoders (VAE). We formally frame the counterfactual question answered by influence functions in this setting and, through theoretical analysis, examine what they reveal about the impact of training samples on classical unsupervised learning methods. We then introduce VAE-TracIn, a computationally efficient and theoretically sound solution based on Pruthi et al. [28], for VAEs. Finally, we evaluate VAE-TracIn on several real-world datasets with extensive quantitative and qualitative analysis.

Privacy Amplification Via Bernoulli Sampling

May 21, 2021
Jacob Imola, Kamalika Chaudhuri

Balancing privacy and accuracy is a major challenge in designing differentially private machine learning algorithms. To improve this tradeoff, prior work has looked at privacy amplification methods which analyze how common training operations such as iteration and subsampling the data can lead to higher privacy. In this paper, we analyze privacy amplification properties of a new operation, sampling from the posterior, that is used in Bayesian inference. In particular, we look at Bernoulli sampling from a posterior that is described by a differentially private parameter. We provide an algorithm to compute the amplification factor in this setting, and establish upper and lower bounds on this factor. Finally, we look at what happens when we draw k posterior samples instead of one.
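
As a point of reference for the subsampling-based amplification mentioned above as prior work, the classical bound says that an ε-DP mechanism run on a Poisson subsample with rate q satisfies ε'-DP with ε' = log(1 + q(e^ε − 1)). The sketch below evaluates only that standard bound; it is not the Bernoulli posterior-sampling amplification factor this paper computes, and the function name is hypothetical.

```python
import math


def amplified_epsilon(epsilon, q):
    # Classical amplification-by-subsampling bound (an assumption here,
    # not this paper's result): eps' = log(1 + q * (exp(eps) - 1)).
    if epsilon < 0.0:
        raise ValueError("epsilon must be non-negative")
    if not 0.0 <= q <= 1.0:
        raise ValueError("sampling rate q must lie in [0, 1]")
    # log1p/expm1 keep the computation accurate for small epsilon and q.
    return math.log1p(q * math.expm1(epsilon))


# Compare the guarantee with and without subsampling at rate 0.1.
base = 1.0
amplified = amplified_epsilon(base, 0.1)
```

The bound interpolates between no privacy loss at q = 0 and the original ε at q = 1, which is the sense in which subsampling "amplifies" privacy.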

* 17 pages, 5 figures

Universal Approximation of Residual Flows in Maximum Mean Discrepancy

Mar 10, 2021
Zhifeng Kong, Kamalika Chaudhuri

Normalizing flows are a class of flexible deep generative models that offer easy likelihood computation. Despite their empirical success, there is little theoretical understanding of their expressiveness. In this work, we study residual flows, a class of normalizing flows composed of Lipschitz residual blocks. We prove residual flows are universal approximators in maximum mean discrepancy. We provide upper bounds on the number of residual blocks to achieve approximation under different assumptions.

* 8 pages

Location Trace Privacy Under Conditional Priors

Feb 23, 2021
Casey Meehan, Kamalika Chaudhuri

Providing meaningful privacy to users of location-based services is particularly challenging when multiple locations are revealed in a short period of time. This is primarily due to the tremendous degree of dependence that can be anticipated between points. We propose a Rényi divergence based privacy framework for bounding expected privacy loss for conditionally dependent data. Additionally, we demonstrate an algorithm for achieving this privacy under Gaussian process conditional priors. This framework both exemplifies why conditionally dependent data is so challenging to protect and offers a strategy for preserving privacy to within a fixed radius for sensitive locations in a user's trace.

* To be published in the proceedings of AISTATS 2021

Consistent Non-Parametric Methods for Adaptive Robustness

Feb 18, 2021
Robi Bhattacharjee, Kamalika Chaudhuri

Learning classifiers that are robust to adversarial examples has received a great deal of recent attention. A major drawback of the standard robust learning framework is the imposition of an artificial robustness radius $r$ that applies to all inputs, and ignores the fact that data may be highly heterogeneous. In this paper, we address this limitation by proposing a new framework for adaptive robustness, called neighborhood preserving robustness. We present sufficient conditions under which general non-parametric methods that can be represented as weight functions satisfy our notion of robustness, and show that both nearest neighbors and kernel classifiers satisfy these conditions in the large sample limit.

Connecting Interpretability and Robustness in Decision Trees through Separation

Feb 14, 2021
Michal Moshkovitz, Yao-Yuan Yang, Kamalika Chaudhuri

Recent research has recognized interpretability and robustness as essential properties of trustworthy classification. Curiously, a connection between robustness and interpretability was empirically observed, but the theoretical reasoning behind it remained elusive. In this paper, we rigorously investigate this connection. Specifically, we focus on interpretation using decision trees and robustness to $\ell_{\infty}$-perturbation. Previous works defined the notion of $r$-separation as a sufficient condition for robustness. We prove upper and lower bounds on the tree size when the data is $r$-separated. We then show that a tighter bound on the size is possible when the data is linearly separated. We provide the first algorithm with provable guarantees on robustness, interpretability, and accuracy in the context of decision trees. Experiments confirm that our algorithm yields classifiers that are both interpretable and robust and have high accuracy. The code for the experiments is available at https://github.com/yangarbiter/interpretable-robust-trees.
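
The $r$-separation condition mentioned above can be checked directly on a finite dataset. The sketch below assumes the commonly used definition, in which any two points with different labels are at $\ell_{\infty}$ distance at least $2r$; the abstract does not spell the definition out, so treat that threshold and the function names as assumptions. It is unrelated to the code in the linked repository.

```python
def linf_distance(a, b):
    # l-infinity distance: the largest coordinate-wise absolute difference.
    return max(abs(x - y) for x, y in zip(a, b))


def is_r_separated(points, labels, r):
    # Assumed definition: every pair of points with different labels
    # is at l-infinity distance at least 2 * r.
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] != labels[j]:
                if linf_distance(points[i], points[j]) < 2 * r:
                    return False
    return True


points = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.2]]
labels = [0, 0, 1, 1]
# The closest differently labelled pair is [0.1, 0.2] and [1.0, 1.0].
separated_small = is_r_separated(points, labels, r=0.4)
separated_large = is_r_separated(points, labels, r=0.5)
```

The pairwise loop is quadratic in the number of points, which is fine for a sanity check but would need a spatial index for large datasets.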
