Accurately modeling the nonlinear dynamics of a system from measurement data is a challenging yet vital topic. The sparse identification of nonlinear dynamics (SINDy) algorithm is one approach to discover dynamical systems models from data. Although extensions have been developed to identify implicit dynamics, or dynamics described by rational functions, these extensions are extremely sensitive to noise. In this work, we develop SINDy-PI (parallel, implicit), a robust variant of the SINDy algorithm to identify implicit dynamics and rational nonlinearities. The SINDy-PI framework includes multiple optimization algorithms and a principled approach to model selection. We demonstrate the ability of this algorithm to learn implicit ordinary and partial differential equations and conservation laws from limited and noisy data. In particular, we show that the proposed approach is several orders of magnitude more noise robust than previous approaches, and may be used to identify a class of complex ODE and PDE dynamics that were previously unattainable with SINDy, including for the double pendulum dynamics and the Belousov Zhabotinsky (BZ) reaction.
We develop a deep autoencoder architecture that can be used to find a coordinate transformation which turns a nonlinear PDE into a linear PDE. Our architecture is motivated by the linearizing transformations provided by the Cole-Hopf transform for Burgers equation and the inverse scattering transform for completely integrable PDEs. By leveraging a residual network architecture, a near-identity transformation can be exploited to encode intrinsic coordinates in which the dynamics are linear. The resulting dynamics are given by a Koopman operator matrix $\mathbf{K}$. The decoder allows us to transform back to the original coordinates as well. Multiple time step prediction can be performed by repeated multiplication by the matrix $\mathbf{K}$ in the intrinsic coordinates. We demonstrate our method on a number of examples, including the heat equation and Burgers equation, as well as the substantially more challenging Kuramoto-Sivashinsky equation, showing that our method provides a robust architecture for discovering interpretable, linearizing transforms for nonlinear PDEs.
First principles modeling of physical systems has led to significant technological advances across all branches of science. For nonlinear systems, however, small modeling errors can lead to significant deviations from the true, measured behavior. Even in mechanical systems, where the equations are assumed to be well-known, there are often model discrepancies corresponding to nonlinear friction, wind resistance, etc. Discovering models for these discrepancies remains an open challenge for many complex systems. In this work, we use the sparse identification of nonlinear dynamics (SINDy) algorithm to discover a model for the discrepancy between a simplified model and measurement data. In particular, we assume that the model mismatch can be sparsely represented in a library of candidate model terms. We demonstrate the efficacy of our approach on several examples including experimental data from a double pendulum on a cart. We further design and implement a feed-forward controller in simulations, showing improvement with a discrepancy model.
Machine learning (ML) is redefining what is possible in data-intensive fields of science and engineering. However, applying ML to problems in the physical sciences comes with a unique set of challenges: scientists want physically interpretable models that can (i) generalize to predict previously unobserved behaviors, (ii) provide effective forecasting predictions (extrapolation), and (iii) be certifiable. Autonomous systems will necessarily interact with changing and uncertain environments, motivating the need for models that can accurately extrapolate based on physical principles (e.g. Newton's universal second law for classical mechanics, F=ma). Standard ML approaches have shown impressive performance for predicting dynamics in an interpolatory regime, but the resulting models often lack interpretability and fail to generalize. In this paper, we introduce a unified sparse optimization framework that learns governing dynamical systems models from data, selecting relevant terms in the dynamics from a library of possible functions. The resulting models are parsimonious, have physical interpretations, and can generalize to new parameter regimes. Our framework allows the use of non-convex sparsity promoting regularization functions and can be adapted to address key challenges in scientific problems and data sets, including outliers, parametric dependencies, and physical constraints. We show that the approach discovers parsimonious dynamical models on several example systems, including a spiking neuron model. This flexible approach can be tailored to the unique challenges associated with a wide range of applications and data sets, providing a powerful ML-based framework for learning governing models for physical systems from data.
Machine learning (ML) and artificial intelligence (AI) algorithms are now being used to automate the discovery of physics principles and governing equations from measurement data alone. However, positing a universal physical law from data is challenging without simultaneously proposing an accompanying discrepancy model to account for the inevitable mismatch between theory and measurements. By revisiting the classic problem of modeling falling objects of different size and mass, we highlight a number of subtle and nuanced issues that must be addressed by modern data-driven methods for the automated discovery of physics. Specifically, we show that measurement noise and complex secondary physical mechanisms, such as unsteady fluid drag forces, can obscure the underlying law of gravitation, leading to an erroneous model. Without proposing an appropriate discrepancy model to handle these drag forces, the data supports an Aristotelian, versus a Galilean, theory of gravitation. Using the sparse identification of nonlinear dynamics (SINDy) algorithm, with the additional assumption that each separate falling object is governed by the same physical law, we are able to identify a viable discrepancy model to account for the fluid dynamic forces that explain the mismatch between a posited universal law of gravity and the measurement data. This work highlights the fact that the simple application of ML/AI will generally be insufficient to extract universal physical laws without further modification.
The control of complex systems is of critical importance in many branches of science, engineering, and industry. Controlling an unsteady fluid flow is particularly important, as flow control is a key enabler for technologies in energy (e.g., wind, tidal, and combustion), transportation (e.g., planes, trains, and automobiles), security (e.g., tracking airborne contamination), and health (e.g., artificial hearts and artificial respiration). However, the high-dimensional, nonlinear, and multi-scale dynamics make real-time feedback control infeasible. Fortunately, these high-dimensional systems exhibit dominant, low-dimensional patterns of activity that can be exploited for effective control in the sense that knowledge of the entire state of a system is not required. Advances in machine learning have the potential to revolutionize flow control given its ability to extract principled, low-rank feature spaces characterizing such complex systems. We present a novel deep learning model predictive control (DeepMPC) framework that exploits low-rank features of the flow in order to achieve considerable improvements to control performance. Instead of predicting the entire fluid state, we use a recurrent neural network (RNN) to accurately predict the control relevant quantities of the system. The RNN is then embedded into a MPC framework to construct a feedback loop, and incoming sensor data is used to perform online updates to improve prediction accuracy. The results are validated using varying fluid flow examples of increasing complexity.
In many applications, it is important to reconstruct a fluid flow field, or some other high-dimensional state, from limited measurements and limited data. In this work, we propose a shallow neural network-based learning methodology for such fluid flow reconstruction. Our approach learns an end-to-end mapping between the sensor measurements and the high-dimensional fluid flow field, without any heavy preprocessing on the raw data. No prior knowledge is assumed to be available, and the estimation method is purely data-driven. We demonstrate the performance on three examples in fluid mechanics and oceanography, showing that this modern data-driven approach outperforms traditional modal approximation techniques which are commonly used for flow reconstruction. Not only does the proposed method show superior performance characteristics, it can also produce a comparable level of performance with traditional methods in the area, using significantly fewer sensors. Thus, the mathematical architecture is ideal for emerging global monitoring technologies where measurement data are often limited.
Softmax is a standard final layer used in Neural Nets (NNs) to summarize information encoded in the trained NN and return a prediction. However, Softmax leverages only a subset of the class-specific structure encoded in the trained model and ignores potentially valuable information: During training, models encode an array $D$ of class response distributions, where $D_{ij}$ is the distribution of the $j^{th}$ pre-Softmax readout neuron's responses to the $i^{th}$ class. Given a test sample, Softmax implicitly uses only the row of this array $D$ that corresponds to the readout neurons' responses to the sample's true class. Leveraging more of this array $D$ can improve classifier accuracy, because the likelihoods of two competing classes can be encoded in other rows of $D$. To explore this potential resource, we develop a hybrid classifier (Softmax-Pooling Hybrid, $SPH$) that uses Softmax on high-scoring samples, but on low-scoring samples uses a log-likelihood method that pools the information from the full array $D$. We apply $SPH$ to models trained on a vectorized MNIST dataset to varying levels of accuracy. $SPH$ replaces only the final Softmax layer in the trained NN, at test time only. All training is the same as for Softmax. Because the pooling classifier performs better than Softmax on low-scoring samples, $SPH$ reduces test set error by 6% to 23%, using the exact same trained model, whatever the baseline Softmax accuracy. This reduction in error reflects hidden capacity of the trained NN that is left unused by Softmax.
Conserved quantities, i.e. constants of motion, are critical for characterizing many dynamical systems in science and engineering. These quantities are related to underlying symmetries and they provide fundamental knowledge about physical laws, describe the evolution of the system, and enable system reduction. In this work, we formulate a data-driven architecture for discovering conserved quantities based on Koopman theory. The Koopman operator has emerged as a principled linear embedding of nonlinear dynamics, and its eigenfunctions establish intrinsic coordinates along which the dynamics behave linearly. Interestingly, eigenfunctions of the Koopman operator associated with vanishing eigenvalues correspond to conserved quantities of the underlying system. In this paper, we show that these invariants may be identified with data-driven regression and power series expansions, based on the infinitesimal generator of the Koopman operator. We further establish a connection between the Koopman framework, conserved quantities, and the Lie-Poisson bracket. This data-driven method for discovering conserved quantities is demonstrated on the three-dimensional rigid body equations, where we simultaneously discover the total energy and angular momentum and use these intrinsic coordinates to develop a model predictive controller to track a given reference value.
Regularized regression problems are ubiquitous in statistical modeling, signal processing, and machine learning. Sparse regression in particular has been instrumental in scientific model discovery, including compressed sensing applications, variable selection, and high-dimensional analysis. We propose a broad framework for sparse relaxed regularized regression, called SR3. The key idea is to solve a relaxation of the regularized problem, which has three advantages over the state-of-the-art: (1) solutions of the relaxed problem are superior with respect to errors, false positives, and conditioning, (2) relaxation allows extremely fast algorithms for both convex and nonconvex formulations, and (3) the methods apply to composite regularizers such as total variation (TV) and its nonconvex variants. We demonstrate the advantages of SR3 (computational efficiency, higher accuracy, faster convergence rates, greater flexibility) across a range of regularized regression problems with synthetic and real data, including applications in compressed sensing, LASSO, matrix completion, TV regularization, and group sparsity. To promote reproducible research, we also provide a companion Matlab package that implements these examples.