We introduce a nonlinear stochastic model reduction technique for high-dimensional stochastic dynamical systems that have a low-dimensional invariant effective manifold with slow dynamics and high-dimensional, large fast modes. Given only access to a black-box simulator from which short bursts of simulation can be obtained, we estimate the invariant manifold and the effective (stochastic) dynamics on it, and construct an efficient simulator thereof. These estimation steps can be performed on the fly, leading to efficient exploration of the effective state space without losing consistency with the underlying dynamics. This construction enables fast and efficient simulation of paths of the effective dynamics, together with estimation of crucial features and observables of such dynamics, including the stationary distribution, identification of metastable states, and residence times and transition rates between them.
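The core estimation step from short simulation bursts can be illustrated with a minimal sketch. The snippet below is not the paper's algorithm: it assumes a simple 1D setting, uses an Ornstein-Uhlenbeck process as a stand-in for the black-box simulator, and estimates local drift and diffusion coefficients of the effective dynamics by moment estimators over an ensemble of bursts. The function names (`burst`, `local_coefficients`) and all numerical parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def burst(x0, t_burst, dt, n_reps):
    """Stand-in black-box simulator: run n_reps short bursts of length
    t_burst from x0 for the OU process dX = -X dt + 0.5 dW."""
    n_steps = int(t_burst / dt)
    x = np.full(n_reps, x0, dtype=float)
    for _ in range(n_steps):
        x += -x * dt + 0.5 * np.sqrt(dt) * rng.standard_normal(n_reps)
    return x

def local_coefficients(x0, t_burst=0.05, dt=1e-3, n_reps=2000):
    """Moment estimators from burst endpoints:
    drift ~ mean displacement / t, squared diffusion ~ variance / t."""
    ends = burst(x0, t_burst, dt, n_reps)
    b = (ends.mean() - x0) / t_burst      # estimated local drift b(x0)
    s2 = ends.var() / t_burst             # estimated local squared diffusion
    return b, s2

b, s2 = local_coefficients(1.0)
print(b, s2)
```

For the stand-in process the estimates should be close to the true drift $-x_0 = -1$ and squared diffusion $0.25$ at $x_0 = 1$, up to a small bias from the finite burst length; the same local estimates, computed on the fly at visited points, are what an effective simulator would reuse.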
Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. However, locally Lipschitz SDEs often require special treatments, such as implicit schemes with small time-steps, to accurately simulate the ergodic measure. We introduce a framework to construct inference-based schemes adaptive to large time-steps (ISALT) from data, achieving a reduction in computation time by several orders of magnitude. The key is the statistical learning of an approximation to the infinite-dimensional discrete-time flow map. We explore the use of numerical schemes (such as Euler-Maruyama, a hybrid RK4, and an implicit scheme) to derive informed basis functions, leading to a parameter inference problem. We introduce a scalable algorithm to estimate the parameters by least squares, and we prove the convergence of the estimators as the data size increases. We test ISALT on three non-globally Lipschitz SDEs: the 1D double-well potential, a 2D multi-scale gradient system, and the 3D stochastic Lorenz equation with degenerate noise. Numerical results show that ISALT can tolerate time-steps orders of magnitude larger than those of plain numerical schemes, and it reaches its optimal accuracy in reproducing the invariant measure at medium-large time-steps.
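The inference idea can be sketched on the 1D double-well example. The code below is a simplified illustration, not the paper's exact scheme: it generates data with a fine-step Euler-Maruyama solver, subsamples at a large time-step $\Delta$, and fits a one-parameter flow map whose basis function comes from the Euler-Maruyama scheme by least squares; the effective noise level is read off the residuals. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
delta, m = 1e-4, 1000            # fine step and subsampling factor
Delta = m * delta                # large effective time-step
sigma = 0.7

drift = lambda x: x - x**3       # double-well potential V(x) = x^4/4 - x^2/2

# Fine-scale trajectory serving as ground-truth data.
n_fine = 1_000_000
x = np.empty(n_fine)
x[0] = 0.0
noise = sigma * np.sqrt(delta) * rng.standard_normal(n_fine - 1)
for i in range(n_fine - 1):
    x[i + 1] = x[i] + drift(x[i]) * delta + noise[i]

X = x[::m]                                        # coarse observations at Delta
dX = np.diff(X)                                   # one-step coarse increments
Phi = np.column_stack([drift(X[:-1]) * Delta])    # EM-informed basis function
c, *_ = np.linalg.lstsq(Phi, dX, rcond=None)      # least-squares estimator
res = dX - Phi @ c
sigma_eff = np.sqrt(res.var() / Delta)            # effective noise level
print(c[0], sigma_eff)
```

The fitted coefficient absorbs the mismatch between the plain Euler-Maruyama step and the true flow map at the large time-step, which is what lets the inferred scheme remain stable where the plain scheme with step $\Delta$ would not.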
Inference in hidden Markov models has been challenging to scale because of the dependencies in the observation data. In this paper, we exploit the inherent memory decay of hidden Markov models, so that the forward and backward probabilities can be computed over subsequences, enabling efficient inference over long sequences of observations. We formulate the forward filtering process in the setting of random dynamical systems, where the product of i.i.d. random matrices admits Lyapunov exponents. The rate of memory decay is, almost surely, $\lambda_2-\lambda_1$, the gap between the top two Lyapunov exponents. We propose an efficient and accurate algorithm to numerically estimate this gap after a soft-max parametrization. Given a controlled error $\epsilon$, the required subsequence length is $B=\log(\epsilon)/(\lambda_2-\lambda_1)$. We theoretically prove the validity of the algorithm and demonstrate its effectiveness with numerical examples. The method developed here can be applied to widely used algorithms, such as mini-batch stochastic gradient methods. Moreover, the continuity of the Lyapunov spectrum ensures that the estimated $B$ can be reused for nearby parameters during inference.
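The quantities involved can be illustrated with a minimal sketch: estimating the top Lyapunov exponents of a product of i.i.d. random matrices by repeated QR re-orthonormalization, then converting the gap into a subsequence length via $B=\log(\epsilon)/(\lambda_2-\lambda_1)$. The Gaussian matrix ensemble below is illustrative only; in the HMM setting the matrices would be the forward-filtering matrices under the soft-max parametrization.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 3, 5000                  # matrix dimension and product length
Q = np.eye(d)
logs = np.zeros(d)
for _ in range(n):
    A = rng.normal(size=(d, d)) / np.sqrt(d)   # i.i.d. random matrix
    Q, R = np.linalg.qr(A @ Q)                 # re-orthonormalize the frame
    logs += np.log(np.abs(np.diag(R)))         # accumulate per-step growth

lyap = np.sort(logs / n)[::-1]   # lambda_1 >= lambda_2 >= ... almost surely
gap = lyap[1] - lyap[0]          # lambda_2 - lambda_1 < 0: memory decay rate
eps = 1e-6
B = np.log(eps) / gap            # subsequence length for controlled error eps
print(lyap, B)
```

Since both $\log(\epsilon)$ and the gap are negative, $B$ is a positive length; the continuity of the Lyapunov spectrum in the matrix ensemble is what justifies reusing an estimated $B$ for nearby parameter values.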