The project of physics discovery is often equivalent to finding the most concise description of a physical system. The description with optimal predictive capability for a dataset generated by a physical system is one that minimizes both the predictive error on the dataset and the complexity of the description. The discovery of the governing physics of a system can therefore be viewed as a mathematical optimization problem. We outline here a method to optimize the description of arbitrarily complex physical systems by minimizing the entropy of that description. The Recursive Domain Partitioning (RDP) procedure finds the optimal partitioning of each physical domain into subdomains, and the optimal predictive function within each subdomain. Penalty functions are introduced to limit the complexity of the predictive function within each domain. Examples are shown in 1D and 2D. In 1D, the technique discovers the elastic and plastic regions within a stress-strain curve generated by simulations of amorphous carbon, while in 2D it discovers the free-flow region and the inertially obstructed flow region in a simulation of fluid flow across a plate.
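As a rough illustration of the partitioning idea in one dimension, the sketch below recursively splits a curve wherever two straight-line fits, each charged a fixed complexity penalty, cost less than a single fit. It is not the RDP procedure itself: the entropy objective is not spelled out here, so squared error plus a constant per-subdomain penalty is swapped in, the split search is a greedy binary one, and the names `fit_cost`, `partition`, `penalty`, and `min_points` are hypothetical.

```python
import numpy as np

def fit_cost(x, y, penalty):
    """Squared error of a straight-line fit plus a fixed complexity penalty.

    Assumption: this stands in for the entropy-based cost of the abstract.
    """
    coeffs = np.polyfit(x, y, deg=1)
    residual = y - np.polyval(coeffs, x)
    return float(residual @ residual) + penalty

def partition(x, y, penalty=1.0, min_points=5):
    """Return the sorted breakpoints (x values) found by recursive splitting.

    x must be sorted; each subdomain keeps at least min_points samples.
    """
    n = len(x)
    best_cost, best_k = fit_cost(x, y, penalty), None
    # Try every admissible split and keep the cheapest one, if any beats
    # the cost of describing the whole domain with a single line.
    for k in range(min_points, n - min_points + 1):
        cost = fit_cost(x[:k], y[:k], penalty) + fit_cost(x[k:], y[k:], penalty)
        if cost < best_cost:
            best_cost, best_k = cost, k
    if best_k is None:
        return []  # a single line is the cheapest description here
    left = partition(x[:best_k], y[:best_k], penalty, min_points)
    right = partition(x[best_k:], y[best_k:], penalty, min_points)
    return left + [float(x[best_k])] + right
```

On a synthetic curve that rises linearly and then plateaus, a call such as `partition(x, y, penalty=0.1)` would be expected to place one breakpoint near the kink. Raising `penalty` makes every extra subdomain more expensive and so yields coarser partitions.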
Molecular dynamics simulations produce data with complex nonlinear dynamics. If the timestep behavior of such a dynamical system can be represented by a linear operator, future states can be inferred directly without further expensive simulation. The use of an autoencoder in combination with a physical timestep operator allows both the relevant structural characteristics of the molecular graphs and the underlying physics of the system to be isolated during the training process. In this work, we develop a pipeline for establishing graph-structured representations of time-series volumetric data from molecular dynamics simulations. We then train an autoencoder to find nonlinear mappings to a latent space where future timesteps can be predicted through application of a linear operator trained in tandem with the autoencoder. Increasing the dimensionality of the autoencoder output is shown to improve the accuracy of the physical timestep operator.
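The role of the latent space can be illustrated with a toy system, sketched below under strong simplifying assumptions. A hand-written nonlinear lift `encode` stands in for the trained autoencoder, a two-variable map `step` stands in for the molecular dynamics simulation, and the operator `K` is fitted by ordinary least squares rather than trained in tandem with an encoder. All function names and parameter values are hypothetical.

```python
import numpy as np

def step(x, a=0.9, b=0.5, c=0.3):
    """One timestep of a toy nonlinear system (stand-in for the simulation)."""
    return np.array([a * x[0], b * x[1] + c * x[0] ** 2])

def simulate(x0, n_steps):
    """Return the trajectory as an array of shape (n_steps + 1, 2)."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return np.array(traj)

def encode(x):
    """Hand-written lift to 3 latent variables; a trained encoder would go here."""
    x = np.atleast_2d(x)
    return np.column_stack([x[:, 0], x[:, 1], x[:, 0] ** 2])

def decode(z):
    """Recover the physical state: the first two latent coordinates."""
    return np.atleast_2d(z)[:, :2]

def fit_operator(z):
    """Least-squares K such that z[t + 1] ~= z[t] @ K."""
    K, *_ = np.linalg.lstsq(z[:-1], z[1:], rcond=None)
    return K

def rollout(z0, K, n_steps):
    """Advance a latent state n_steps times by repeated application of K."""
    z = [np.asarray(z0, dtype=float)]
    for _ in range(n_steps):
        z.append(z[-1] @ K)
    return np.array(z)

# Fit the operator on one trajectory, then predict an unseen one.
train = simulate([1.0, 0.5], 20)
K = fit_operator(encode(train))
x0_new = np.array([-0.7, 1.2])
predicted = decode(rollout(encode(x0_new)[0], K, 15))
```

In this toy case the three lifted variables evolve exactly linearly, so `predicted` should track `simulate(x0_new, 15)` closely, whereas an operator fitted directly on the two raw state variables cannot capture the quadratic coupling. That is only an analogy for the reported benefit of a higher-dimensional latent space, not a reproduction of it.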