Abstract:This monograph explores the design and analysis of correlated noise mechanisms for differential privacy (DP), focusing on their application to private training of AI and machine learning models via the core primitive of estimation of weighted prefix sums. While typical DP mechanisms inject independent noise into each step of a stochastic gradient (SGD) learning algorithm in order to protect the privacy of the training data, a growing body of recent research demonstrates that introducing (anti-)correlations in the noise can significantly improve privacy-utility trade-offs by carefully canceling out some of the noise added on earlier steps in subsequent steps. Such correlated noise mechanisms, known variously as matrix mechanisms, factorization mechanisms, and DP-Follow-the-Regularized-Leader (DP-FTRL) when applied to learning algorithms, have also been influential in practice, with industrial deployment at a global scale.
Abstract:Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to humans but algorithmically detectable from a short span of tokens. We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality, and can be detected using an efficient open-source algorithm without access to the language model API or parameters. The watermark works by selecting a randomized set of "green" tokens before a word is generated, and then softly promoting use of green tokens during sampling. We propose a statistical test for detecting the watermark with interpretable p-values, and derive an information-theoretic framework for analyzing the sensitivity of the watermark. We test the watermark using a multi-billion parameter model from the Open Pretrained Transformer (OPT) family, and discuss robustness and security.
Abstract:In this paper we propose a game-theoretic model to analyze events similar to the 2009 \emph{DARPA Network Challenge}, which was organized by the Defense Advanced Research Projects Agency (DARPA) for exploring the roles that the Internet and social networks play in incentivizing wide-area collaborations. The challenge was to form a group that would be the first to find the locations of ten moored weather balloons across the United States. We consider a model in which $N$ people (who can form groups) are located in some topology with a fixed coverage volume around each person's geographical location. We consider various topologies where the players can be located such as the Euclidean $d$-dimension space and the vertices of a graph. A balloon is placed in the space and a group wins if it is the first one to report the location of the balloon. A larger team has a higher probability of finding the balloon, but we assume that the prize money is divided equally among the team members. Hence there is a competing tension to keep teams as small as possible. \emph{Risk aversion} is the reluctance of a person to accept a bargain with an uncertain payoff rather than another bargain with a more certain, but possibly lower, expected payoff. In our model we consider the \emph{isoelastic} utility function derived from the Arrow-Pratt measure of relative risk aversion. The main aim is to analyze the structures of the groups in Nash equilibria for our model. For the $d$-dimensional Euclidean space ($d\geq 1$) and the class of bounded degree regular graphs we show that in any Nash Equilibrium the \emph{richest} group (having maximum expected utility per person) covers a constant fraction of the total volume.