Abstract:Accurate calculations of solvation free energies remain a central challenge in molecular simulations, often requiring extensive sampling and numerous alchemical intermediates to ensure sufficient overlap between phase-space distributions of a solute in the gas phase and in solution. Here, we introduce a computational framework based on normalizing flows that directly maps solvent configurations between solutes of different sizes, and compare the accuracy and efficiency to conventional free energy estimates. For a Lennard-Jones solvent, we demonstrate that this approach yields acceptable accuracy in estimating free energy differences for challenging transformations, such as solute growth or increased solute-solute separation, which typically demand multiple intermediate simulation steps along the transformation. Analysis of radial distribution functions indicates that the flow generates physically meaningful solvent rearrangements, substantially enhancing configurational overlap between states in configuration space. These results suggest flow-based models as a promising alternative to traditional free energy estimation methods.
Abstract:We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task? And how can clustering results be validated? Connectivity-based versus prototype-based approaches are reflected in the context of several popular methods: single-linkage, spectral embedding, k-means, and Gaussian mixtures are discussed as well as the density-based protocols (H)DBSCAN, Jarvis-Patrick, CommonNN, and density-peaks.