Abstract:Uncertainty quantification for image data is dominated by complex deep learning methods, yet the field lacks an interpretable, mathematically grounded baseline. We propose Bayesian scattering to fill this gap, serving as a first-step baseline akin to the role of Bayesian linear regression for tabular data. Our method couples the wavelet scattering transform-a deep, non-learned feature extractor-with a simple probabilistic head. Because scattering features are derived from geometric principles rather than learned, they avoid overfitting the training distribution. This helps provide sensible uncertainty estimates even under significant distribution shifts. We validate this on diverse tasks, including medical imaging under institution shift, wealth mapping under country-to-country shift, and Bayesian optimization of molecular properties. Our results suggest that Bayesian scattering is a solid baseline for complex uncertainty quantification methods.
Abstract:Cyclic molecules are ubiquitous across applications in chemistry and biology. Their restricted conformational flexibility provides structural pre-organization that is key to their function in drug discovery and catalysis. However, reliably sampling the conformer ensembles of ring systems remains challenging. Here, we introduce PuckerFlow, a generative machine learning model that performs flow matching on the Cremer-Pople space, a low-dimensional internal coordinate system capturing the relevant degrees of freedom of rings. Our approach enables generation of valid closed rings by design and demonstrates strong performance in generating conformers that are both diverse and precise. We show that PuckerFlow outperforms other conformer generation methods on nearly all quantitative metrics and illustrate the potential of PuckerFlow for ring systems relevant to chemical applications, particularly in catalysis and drug discovery. This work enables efficient and reliable conformer generation of cyclic structures, paving the way towards modeling structure-property relationships and the property-guided generation of rings across a wide range of applications in chemistry and biology.
Abstract:General parameters are highly desirable in the natural sciences - e.g., chemical reaction conditions that enable high yields across a range of related transformations. This has a significant practical impact since those general parameters can be transferred to related tasks without the need for laborious and time-intensive re-optimization. While Bayesian optimization (BO) is widely applied to find optimal parameter sets for specific tasks, it has remained underused in experiment planning towards such general optima. In this work, we consider the real-world problem of condition optimization for chemical reactions to study how performing generality-oriented BO can accelerate the identification of general optima, and whether these optima also translate to unseen examples. This is achieved through a careful formulation of the problem as an optimization over curried functions, as well as systematic evaluations of generality-oriented strategies for optimization tasks on real-world experimental data. We find that for generality-oriented optimization, simple myopic optimization strategies that decouple parameter and task selection perform comparably to more complex ones, and that effective optimization is merely determined by an effective exploration of both parameter and task space.