Abstract: Combinatory Homomorphic Automatic Differentiation (CHAD) was originally formulated as a semantics-driven source transformation for reverse-mode AD in total programming languages. We extend this framework to partial languages with features such as potentially non-terminating operations, real-valued conditionals, and iteration constructs like while-loops, while retaining CHAD's principle of structure-preserving semantics. A key contribution is the introduction of iteration-extensive indexed categories, which allow iteration in the base category to lift to parameterized initial algebras in the indexed category. This enables iteration to be interpreted in the Grothendieck construction of the target language in a principled way. The resulting fibred iterative structure cleanly models iteration in the categorical semantics. Consequently, the extended CHAD transformation remains the unique structure-preserving functor (an iterative Freyd category morphism) from the freely generated iterative Freyd category of the source language to the Grothendieck construction of the target's syntactic semantics, mapping each primitive operation to its derivative. We prove the correctness of this transformation using the universal property of the source language's syntax, showing that the transformed programs compute correct reverse-mode derivatives. Our development also contributes to understanding iteration constructs within dependently typed languages and categories of containers. As our primary motivation and application, we generalize CHAD to languages with data types, partial features, and iteration, providing the first rigorous categorical semantics for reverse-mode CHAD in such settings and formally guaranteeing the correctness of the source-to-source CHAD technique.
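To make the idea of reverse-mode differentiation through iteration concrete, the following is a minimal hand-written sketch in Python. It is not the CHAD transformation or its categorical construction. It only illustrates the general shape of pairing a primal value with a cotangent map, where a loop body either continues with a new state or stops with a result. The names `iterate_rev` and `halve_then_square` are hypothetical, and the example assumes scalar state.

```python
from typing import Callable, Tuple

# A step returns ((tag, value), back): tag is "continue" or "done", and
# back maps a cotangent of the step's output to a cotangent of its input.
Step = Callable[[float], Tuple[Tuple[str, float], Callable[[float], float]]]


def iterate_rev(step: Step, x: float):
    """Run `step` until it signals "done", recording one backpropagator per
    iteration; return the result and a pullback that replays them in reverse.
    The loop may not terminate, mirroring the partiality of while-loops."""
    backprops = []
    while True:
        (tag, value), back = step(x)
        backprops.append(back)
        if tag == "done":
            break
        x = value

    def pullback(cotangent: float) -> float:
        for back in reversed(backprops):
            cotangent = back(cotangent)
        return cotangent

    return value, pullback


def halve_then_square(x: float):
    """Hypothetical loop body with a real-valued conditional:
    halve x while x >= 1, then return x squared."""
    if x < 1.0:
        return ("done", x * x), lambda ct: 2.0 * x * ct
    return ("continue", x / 2.0), lambda ct: ct / 2.0


value, pullback = iterate_rev(halve_then_square, 3.0)
print(value, pullback(1.0))
```

For the input 3.0 the loop halves twice and then squares, so the program locally computes x²/16, and the pullback applied to 1.0 returns x/8.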
Abstract: Probabilistic programming is perfectly suited to reliable and transparent data science, as it allows the user to specify their models in a high-level language without worrying about the complexities of how to fit the models. Static analysis of probabilistic programs presents even further opportunities for enabling a high-level style of programming, by automating time-consuming and error-prone tasks. We apply static analysis to probabilistic programs to automate large parts of two crucial model checking methods: Prior Predictive Checks and Simulation-Based Calibration. Our method transforms a probabilistic program specifying a density function into an efficient forward-sampling form. To achieve this transformation, we extract a factor graph from a probabilistic program using static analysis, generate a set of proposal directed acyclic graphs using a SAT solver, select a graph which will produce provably correct sampling code, then generate one or more sampling programs. We allow minimal user interaction to broaden the scope of application beyond what is possible with static analysis alone. We present an implementation targeting the popular Stan probabilistic programming language, automating large parts of a robust Bayesian workflow for a wide community of probabilistic programming users.
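The step from a factor graph to a forward-sampling order can be sketched as follows. This is an illustrative simplification under stated assumptions, not the method of the paper: it replaces the SAT solver with brute-force enumeration, performs no check that the resulting sampler is distributionally correct, and uses a hypothetical function name `orient` and a hypothetical three-factor model. Each factor is assigned one of its variables as its output, every variable must be generated by exactly one factor, and the induced dependency graph must be acyclic.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import product


def orient(factors):
    """factors: dict mapping a factor name to the tuple of variables it touches.
    Returns (sampling_order, parents) for the first acyclic orientation found,
    or None if no orientation exists."""
    names = list(factors)
    variables = {v for vs in factors.values() for v in vs}
    # Try every way of picking one output variable per factor.
    for outputs in product(*(factors[n] for n in names)):
        # Each variable must be produced by exactly one factor.
        if len(set(outputs)) != len(outputs) or set(outputs) != variables:
            continue
        # The remaining variables of a factor become parents of its output.
        parents = {out: set(factors[n]) - {out} for n, out in zip(names, outputs)}
        try:
            order = list(TopologicalSorter(parents).static_order())
        except CycleError:
            continue  # cyclic orientation: cannot be forward sampled
        return order, parents
    return None


# Hypothetical model: priors on mu and sigma, and a likelihood linking y to both.
factors = {
    "prior_mu": ("mu",),
    "prior_sigma": ("sigma",),
    "likelihood": ("y", "mu", "sigma"),
}
order, parents = orient(factors)
print(order, parents)
```

In this example the only valid orientation samples `mu` and `sigma` from their priors first and then `y` given both, which is the order a prior predictive check would need.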