Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems under stochastic and uncertain environments. A main reason hindering their broad adoption in real-world applications is the lack of availability of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), require the knowledge of the transition dynamics and the observation generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model, which is conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization, in order to develop solutions that are robust to model uncertainty. As a further contribution, we compare the use of transformers and long short-term memory networks, which constitute model-free RL solutions, with a model-based/model-free hybrid approach. We apply these methods to the real-world problem of optimal maintenance planning for railway assets.
Structural Health Monitoring (SHM) describes a process for inferring quantifiable metrics of structural condition, which can serve as input to support decisions on the operation and maintenance of infrastructure assets. Given the long lifespan of critical structures, this problem can be cast as a sequential decision making problem over prescribed horizons. Partially Observable Markov Decision Processes (POMDPs) offer a formal framework to solve the underlying optimal planning task. However, two issues can undermine the POMDP solutions. Firstly, the need for a model that can adequately describe the evolution of the structural condition under deterioration or corrective actions and, secondly, the non-trivial task of recovery of the observation process parameters from available monitoring data. Despite these potential challenges, the adopted POMDP models do not typically account for uncertainty on model parameters, leading to solutions which can be unrealistically confident. In this work, we address both key issues. We present a framework to estimate POMDP transition and observation model parameters directly from available data, via Markov Chain Monte Carlo (MCMC) sampling of a Hidden Markov Model (HMM) conditioned on actions. The MCMC inference estimates distributions of the involved model parameters. We then form and solve the POMDP problem by exploiting the inferred distributions, to derive solutions that are robust to model uncertainty. We successfully apply our approach on maintenance planning for railway track assets on the basis of a "fractal value" indicator, which is computed from actual railway monitoring data.
Accurate structural response prediction forms a main driver for structural health monitoring and control applications. This often requires the proposed model to adequately capture the underlying dynamics of complex structural systems. In this work, we utilize a learnable Extended Kalman Filter (EKF), named the Neural Extended Kalman Filter (Neural EKF) throughout this paper, for learning the latent evolution dynamics of complex physical systems. The Neural EKF is a generalized version of the conventional EKF, where the modeling of process dynamics and sensory observations can be parameterized by neural networks, therefore learned by end-to-end training. The method is implemented under the variational inference framework with the EKF conducting inference from sensing measurements. Typically, conventional variational inference models are parameterized by neural networks independent of the latent dynamics models. This characteristic makes the inference and reconstruction accuracy weakly based on the dynamics models and renders the associated training inadequate. We here show how the structure imposed by the Neural EKF is beneficial to the learning process. We demonstrate the efficacy of the framework on both simulated and real-world monitoring datasets, with the results indicating significant predictive capabilities of the proposed scheme.
The order/dimension of models derived on the basis of data is commonly restricted by the number of observations, or in the context of monitored systems, sensing nodes. This is particularly true for structural systems (e.g. civil or mechanical structures), which are typically high-dimensional in nature. In the scope of physics-informed machine learning, this paper proposes a framework - termed Neural Modal ODEs - to integrate physics-based modeling with deep learning (particularly, Neural Ordinary Differential Equations -- Neural ODEs) for modeling the dynamics of monitored and high-dimensional engineered systems. In this initiating exploration, we restrict ourselves to linear or mildly nonlinear systems. We propose an architecture that couples a dynamic version of variational autoencoders with physics-informed Neural ODEs (Pi-Neural ODEs). An encoder, as a part of the autoencoder, learns the abstract mappings from the first few items of observational data to the initial values of the latent variables, which drive the learning of embedded dynamics via physics-informed Neural ODEs, imposing a \textit{modal model} structure to that latent space. The decoder of the proposed model adopts the eigenmodes derived from an eigen-analysis applied to the linearized portion of a physics-based model: a process implicitly carrying the spatial relationship between degrees-of-freedom (DOFs). The framework is validated on a numerical example, and an experimental dataset of a scaled cable-stayed bridge, where the learned hybrid model is shown to outperform a purely physics-based approach to modeling. We further show the functionality of the proposed scheme within the context of virtual sensing, i.e., the recovery of generalized response quantities in unmeasured DOFs from spatially sparse data.
The need for algorithms able to solve Reinforcement Learning (RL) problems with few trials has motivated the advent of model-based RL methods. The reported performance of model-based algorithms has dramatically increased within recent years. However, it is not clear how much of the recent progress is due to improved algorithms or due to improved models. While different modeling options are available to choose from when applying a model-based approach, the distinguishing traits and particular strengths of different models are not clear. The main contribution of this work lies precisely in assessing the model influence on the performance of RL algorithms. A set of commonly adopted models is established for the purpose of model comparison. These include Neural Networks (NNs), ensembles of NNs, two different approximations of Bayesian NNs (BNNs), that is, the Concrete Dropout NN and the Anchored Ensembling, and Gaussian Processes (GPs). The model comparison is evaluated on a suite of continuous control benchmarking tasks. Our results reveal that significant differences in model performance do exist. The Concrete Dropout NN reports persistently superior performance. We summarize these differences for the benefit of the modeler and suggest that the model choice is tailored to the standards required by each specific application.
In this paper, we propose a probabilistic physics-guided framework, termed Physics-guided Deep Markov Model (PgDMM). The framework is especially targeted to the inference of the characteristics and latent structure of nonlinear dynamical systems from measurement data, where it is typically intractable to perform exact inference of latent variables. A recently surfaced option pertains to leveraging variational inference to perform approximate inference. In such a scheme, transition and emission functions of the system are parameterized via feed-forward neural networks (deep generative models). However, due to the generalized and highly versatile formulation of neural network functions, the learned latent space is often prone to lack physical interpretation and structured representation. To address this, we bridge physics-based state space models with Deep Markov Models, thus delivering a hybrid modeling framework for unsupervised learning and identification for nonlinear dynamical systems. Specifically, the transition process can be modeled as a physics-based model enhanced with an additive neural network component, which aims to learn the discrepancy between the physics-based model and the actual dynamical system being monitored. The proposed framework takes advantage of the expressive power of deep learning, while retaining the driving physics of the dynamical system by imposing physics-driven restrictions on the side of the latent space. We demonstrate the benefits of such a fusion in terms of achieving improved performance on illustrative simulation examples and experimental case studies of nonlinear systems. Our results indicate that the physics-based models involved in the employed transition and emission functions essentially enforce a more structured and physically interpretable latent space, which is essential to generalization and prediction capabilities.
Graph Networks (GNs) enable the fusion of prior knowledge and relational reasoning with flexible function approximations. In this work, a general GN-based model is proposed which takes full advantage of the relational modeling capabilities of GNs and extends these to probabilistic modeling with Variational Bayes (VB). To that end, we combine complementary pre-existing approaches on VB for graph data and propose an approach that relies on graph-structured latent and conditioning variables. It is demonstrated that Neural Processes can also be viewed through the lens of the proposed model. We show applications on the problem of structured probability density modeling for simulated and real wind farm monitoring data, as well as on the meta-learning of simulated Gaussian Process data. We release the source code, along with the simulated datasets.
One of the requirements of the population-based approach to Structural Health Monitoring (SHM) proposed in the earlier papers in this sequence, is that structures be represented by points in an abstract space. Furthermore, these spaces should be metric spaces in a loose sense; i.e. there should be some measure of distance applicable to pairs of points; similar structures should then be close in the metric. However, this geometrical construction is not enough for the framing of problems in data-based SHM, as it leaves undefined the notion of feature spaces. Interpreting the feature values on a structure-by-structure basis as a type of field over the space of structures, it seems sensible to borrow an idea from modern theoretical physics, and define feature assignments as sections in a vector bundle over the structure space. With this idea in place, one can interpret the effect of environmental and operational variations as gauge degrees of freedom, as in modern gauge field theories. This paper will discuss the various geometrical structures required for an abstract theory of feature spaces in SHM, and will draw analogies with how these structures have shown their power in modern physics. In the second part of the paper, the problem of determining the normal condition cross section of a feature bundle is addressed. The solution is provided by the application of Graph Neural Networks (GNN), a versatile non-Euclidean machine learning algorithm which is not restricted to inputs and outputs from vector spaces. In particular, the algorithm is well suited to operating directly on the sort of graph structures which are an important part of the proposed framework for PBSHM. The solution of the normal section problem is demonstrated for a heterogeneous population of truss structures for which the feature of interest is the first natural frequency.
In this work, a novel approach for the construction and training of time series models is presented that deals with the problem of learning on large time series with non-equispaced observations, which at the same time may possess features of interest that span multiple scales. The proposed method is appropriate for constructing predictive models for non-stationary stochastic time series.The efficacy of the method is demonstrated on a simulated stochastic degradation dataset and on a real-world accelerated life testing dataset for ball-bearings. The proposed method, which is based on GraphNets, implicitly learns a model that describes the evolution of the system at the level of a state-vector rather than of a raw observation. The proposed approach is compared to a recurrent network with a temporal convolutional feature extractor head (RNN-tCNN) which forms a known viable alternative for the problem context considered. Finally, by taking advantage of recent advances in the computation of reparametrization gradients for learning probability distributions, a simple yet effective technique for representing prediction uncertainty as a Gamma distribution over remaining useful life predictions is employed.