Get our free extension to see links to code for papers anywhere online!Free extension: code links for papers anywhere!Free add-on: See code for papers anywhere!

Alireza F. Pour, Hassan Ashtiani, Shahab Asoodeh

We study the problem of hypothesis selection under the constraint of local differential privacy. Given a class $\mathcal{F}$ of $k$ distributions and a set of i.i.d. samples from an unknown distribution $h$, the goal of hypothesis selection is to pick a distribution $\hat{f}$ whose total variation distance to $h$ is comparable with the best distribution in $\mathcal{F}$ (with high probability). We devise an $\varepsilon$-locally-differentially-private ($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k}{\alpha^2\min \{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f})\leq \alpha + 9 \min_{f\in \mathcal{F}}d_{TV}(h,f)$ with high probability. This sample complexity is optimal for $\varepsilon<1$, matching the lower bound of Gopi et al. (2020). All previously known algorithms for this problem required $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ samples to work. Moreover, our result demonstrates the power of interaction for $\varepsilon$-LDP hypothesis selection. Namely, it breaks the known lower bound of $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ for the sample complexity of non-interactive hypothesis selection. Our algorithm breaks this barrier using only $\Theta(\log \log k)$ rounds of interaction. To prove our results, we define the notion of \emph{critical queries} for a Statistical Query Algorithm (SQA) which may be of independent interest. Informally, an SQA is said to use a small number of critical queries if its success relies on the accuracy of only a small number of queries it asks. We then design an LDP algorithm that uses a smaller number of critical queries.

Via

Shahab Asoodeh, Mario Diaz

The Noisy-SGD algorithm is widely used for privately training machine learning models. Traditional privacy analyses of this algorithm assume that the internal state is publicly revealed, resulting in privacy loss bounds that increase indefinitely with the number of iterations. However, recent findings have shown that if the internal state remains hidden, then the privacy loss might remain bounded. Nevertheless, this remarkable result heavily relies on the assumption of (strong) convexity of the loss function. It remains an important open problem to further relax this condition while proving similar convergent upper bounds on the privacy loss. In this work, we address this problem for DP-SGD, a popular variant of Noisy-SGD that incorporates gradient clipping to limit the impact of individual samples on the training process. Our findings demonstrate that the privacy loss of projected DP-SGD converges exponentially fast, without requiring convexity or smoothness assumptions on the loss function. In addition, we analyze the privacy loss of regularized (unprojected) DP-SGD. To obtain these results, we directly analyze the hockey-stick divergence between coupled stochastic processes by relying on non-linear data processing inequalities.

Via

Shahab Asoodeh, Huanyu Zhang

We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between $PK$ and $QK$ output distributions of an $\epsilon$-LDP mechanism $K$ in terms of a divergence between the corresponding input distributions $P$ and $Q$, respectively. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(PK\|QK)$ in terms of $\chi^2(P\|Q)$ and $\epsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(PK\|QK)$ in terms of total variation distance $TV(P, Q)$ and $\epsilon$. We then utilize these bounds to establish locally private versions of the Cram\'er-Rao bound, Le Cam's, Assouad's, and the mutual information methods, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to better privacy analyses than the state-of-the-arts in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.

Via

Wael Alghamdi, Shahab Asoodeh, Flavio P. Calmon, Juan Felipe Gomez, Oliver Kosut, Lalitha Sankar, Fei Wei

We introduce a new differential privacy (DP) accountant called the saddle-point accountant (SPA). SPA approximates privacy guarantees for the composition of DP mechanisms in an accurate and fast manner. Our approach is inspired by the saddle-point method -- a ubiquitous numerical technique in statistics. We prove rigorous performance guarantees by deriving upper and lower bounds for the approximation error offered by SPA. The crux of SPA is a combination of large-deviation methods with central limit theorems, which we derive via exponentially tilting the privacy loss random variables corresponding to the DP mechanisms. One key advantage of SPA is that it runs in constant time for the $n$-fold composition of a privacy mechanism. Numerical experiments demonstrate that SPA achieves comparable accuracy to state-of-the-art accounting methods with a faster runtime.

Via

Wael Alghamdi, Shahab Asoodeh, Flavio P. Calmon, Oliver Kosut, Lalitha Sankar, Fei Wei

Most differential privacy mechanisms are applied (i.e., composed) numerous times on sensitive data. We study the design of optimal differential privacy mechanisms in the limit of a large number of compositions. As a consequence of the law of large numbers, in this regime the best privacy mechanism is the one that minimizes the Kullback-Leibler divergence between the conditional output distributions of the mechanism given two different inputs. We formulate an optimization problem to minimize this divergence subject to a cost constraint on the noise. We first prove that additive mechanisms are optimal. Since the optimization problem is infinite dimensional, it cannot be solved directly; nevertheless, we quantize the problem to derive near-optimal additive mechanisms that we call "cactus mechanisms" due to their shape. We show that our quantization approach can be arbitrarily close to an optimal mechanism. Surprisingly, for quadratic cost, the Gaussian mechanism is strictly sub-optimal compared to this cactus mechanism. Finally, we provide numerical results which indicate that cactus mechanism outperforms the Gaussian mechanism for a finite number of compositions.

Via

Wael Alghamdi, Hsiang Hsu, Haewon Jeong, Hao Wang, P. Winston Michalak, Shahab Asoodeh, Flavio P. Calmon

We consider the problem of producing fair probabilistic classifiers for multi-class classification tasks. We formulate this problem in terms of "projecting" a pre-trained (and potentially unfair) classifier onto the set of models that satisfy target group-fairness requirements. The new, projected model is given by post-processing the outputs of the pre-trained classifier by a multiplicative factor. We provide a parallelizable iterative algorithm for computing the projected classifier and derive both sample complexity and convergence guarantees. Comprehensive numerical comparisons with state-of-the-art benchmarks demonstrate that our approach maintains competitive performance in terms of accuracy-fairness trade-off curves, while achieving favorable runtime on large datasets. We also evaluate our method at scale on an open dataset with multiple classes, multiple intersectional protected groups, and over 1M samples.

Via

Shahab Asoodeh, Maryam Aliakbarpour, Flavio P. Calmon

We investigate the local differential privacy (LDP) guarantees of a randomized privacy mechanism via its contraction properties. We first show that LDP constraints can be equivalently cast in terms of the contraction coefficient of the $E_\gamma$-divergence. We then use this equivalent formula to express LDP guarantees of privacy mechanisms in terms of contraction coefficients of arbitrary $f$-divergences. When combined with standard estimation-theoretic tools (such as Le Cam's and Fano's converse methods), this result allows us to study the trade-off between privacy and utility in several testing and minimax and Bayesian estimation problems.

Via

Shahab Asoodeh, Mario Diaz, Flavio P. Calmon

We propose an information-theoretic technique for analyzing privacy guarantees of online algorithms. Specifically, we demonstrate that differential privacy guarantees of iterative algorithms can be determined by a direct application of contraction coefficients derived from strong data processing inequalities for $f$-divergences. Our technique relies on generalizing the Dobrushin's contraction coefficient for total variation distance to an $f$-divergence known as $E_\gamma$-divergence. $E_\gamma$-divergence, in turn, is equivalent to approximate differential privacy. As an example, we apply our technique to derive the differential privacy parameters of gradient descent. Moreover, we also show that this framework can be tailored to batch learning algorithms that can be implemented with one pass over the training dataset.

Via

Shahab Asoodeh, Flavio Calmon

Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems which have found applications in machine learning, design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding and dependence dilution. Leveraging these connections, we prove a new cardinality bound for the auxiliary variable in IB, making its computation more tractable for discrete random variables. In the second part, we introduce a general family of optimization problems, termed as \textit{bottleneck problems}, by replacing mutual information in IB and PF with other notions of mutual information, namely $f$-information and Arimoto's mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantee in a variety of inference tasks with statistical constraints on accuracy and privacy. Although the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of lower convex or upper concave envelope of certain functions. By applying this technique to binary case, we derive closed form expressions for several bottleneck problems.

Via

Shahab Asoodeh, Jiachun Liao, Flavio P. Calmon, Oliver Kosut, Lalitha Sankar

We consider three different variants of differential privacy (DP), namely approximate DP, R\'enyi DP (RDP), and hypothesis test DP. In the first part, we develop a machinery for optimally relating approximate DP to RDP based on the joint range of two $f$-divergences that underlie the approximate DP and RDP. In particular, this enables us to derive the optimal approximate DP parameters of a mechanism that satisfies a given level of RDP. As an application, we apply our result to the moments accountant framework for characterizing privacy guarantees of noisy stochastic gradient descent (SGD). When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget. In the second part, we establish a relationship between RDP and hypothesis test DP which allows us to translate the RDP constraint into a tradeoff between type I and type II error probabilities of a certain binary hypothesis test. We then demonstrate that for noisy SGD our result leads to tighter privacy guarantees compared to the recently proposed $f$-DP framework for some range of parameters.

Via