Abstract: The high cost of ownership of AI compute infrastructure and the challenges of robustly serving large language models (LLMs) have led to a surge in managed Model-as-a-service deployments. Even when enterprises choose on-premises deployments, the compute infrastructure is typically shared across many teams in order to maximize the return on investment. In both scenarios the deployed models operate only on plaintext data, so enterprise data owners must allow their data to appear in plaintext on shared or multi-tenant compute infrastructure. As a result, owners of private or sensitive data are hesitant to use that data with these deployments, or are restricted in what data they may use. In this work we introduce the Stained Glass Transform, a learned, stochastic, and sequence-dependent transformation of the word embeddings of an LLM that provides information-theoretic privacy for the LLM's input while preserving the utility of the model. We theoretically connect a particular class of Stained Glass Transforms to the theory of mutual information of Gaussian Mixture Models. We then calculate a posteriori privacy estimates, based on mutual information, and verify the privacy and utility of instances of transformed embeddings through token-level privacy metrics and standard LLM performance benchmarks.
Abstract: The sensitivity of heterogeneous energetic (HE) materials (propellants, explosives, and pyrotechnics) depends critically on their microstructure. Chemical reactions initiate at hot spots, where energy localizes at pores and other defects. Emerging multi-scale predictive models of HE response to loads account for the physics at the meso-scale, i.e., at the scale of statistically representative clusters of particles and other microstructural features. Meso-scale physics is infused into machine-learned closure models informed by resolved meso-scale simulations. Since microstructures are stochastic, ensembles of meso-scale simulations are required to quantify hot spot ignition and growth and to develop models for microstructure-dependent energy deposition rates. We propose using generative adversarial networks (GANs) to spawn ensembles of synthetic HE material microstructures. The method generates qualitatively and quantitatively realistic microstructures by learning from images of HE microstructures. We show that the proposed GAN method also permits the generation of new morphologies in which the porosity distribution can be controlled and spatially manipulated. Such control paves the way for designing novel microstructures that engineer HE materials for targeted performance in a materials-by-design framework.