Digital twins are emerging in many industries, typically consisting of simulation models and data associated with a specific physical system. One of the main reasons for developing a digital twin is to enable the simulation of possible consequences of a given action, without the need to interfere with the physical system itself. Physical systems of interest, and the environments they operate in, do not always behave deterministically. Moreover, information about the system and its environment is typically incomplete or imperfect. Probabilistic representations of systems and environments may therefore be called for, especially to support decisions in application areas where actions may have severe consequences. In this paper we introduce the probabilistic digital twin (PDT). We start by discussing how epistemic uncertainty can be treated using measure theory, by modelling epistemic information via $\sigma$-algebras. Based on this, we give a formal definition of how epistemic uncertainty can be updated in a PDT. We then study the problem of optimal sequential decision making, that is, the case where the outcome of each decision may inform the next. We formulate this optimization problem within the PDT framework and discuss how it may be solved (at least in theory) via the maximum principle method or the dynamic programming principle. However, due to the curse of dimensionality, these methods are often not tractable in practice. To address this, we propose a generic approximate solution using deep reinforcement learning together with neural networks defined on sets. We illustrate the method on a practical problem, considering optimal information gathering for the estimation of a failure probability.
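To make the idea of updating epistemic uncertainty about a failure probability concrete, the following is a minimal toy sketch in Python. It is not the PDT construction of the paper: it assumes a hypothetical load $X \sim N(\mu, 1)$ with failure when $X > 3$, treats $\mu$ as the epistemically uncertain quantity with a Gaussian prior, and uses a standard conjugate update after gathering measurements. All names, distributions, and numbers are assumptions chosen for illustration.

```python
import numpy as np
from math import erf, sqrt

def failure_prob(mu, threshold=3.0):
    """P(X > threshold | mu) for X ~ N(mu, 1); mu may be an array."""
    z = (threshold - np.asarray(mu)) / sqrt(2.0)
    return 0.5 * (1.0 - np.vectorize(erf)(z))

def update_normal(prior_mean, prior_var, data, noise_var=1.0):
    """Conjugate Gaussian update of the belief about mu given noisy data."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(data) / noise_var)
    return post_mean, post_var

rng = np.random.default_rng(0)

# Hypothetical "true" system, unknown to the decision maker.
true_mu = 0.5
data = rng.normal(true_mu, 1.0, size=20)  # information gathered by measurement

# Epistemic belief about mu before and after the measurements.
prior_mean, prior_var = 0.0, 1.0
post_mean, post_var = update_normal(prior_mean, prior_var, data)

# Epistemic uncertainty about mu induces uncertainty about the failure probability.
mu_prior = rng.normal(prior_mean, np.sqrt(prior_var), size=10_000)
mu_post = rng.normal(post_mean, np.sqrt(post_var), size=10_000)
spread_prior = np.std(failure_prob(mu_prior))
spread_post = np.std(failure_prob(mu_post))

print(f"posterior variance of mu: {post_var:.4f}")
print(f"spread of failure probability, prior:     {spread_prior:.4f}")
print(f"spread of failure probability, posterior: {spread_post:.4f}")
```

In this toy setting, a sequential decision problem would ask how many measurements (and of what kind) are worth gathering before the failure probability is known well enough; the sketch only shows a single update step.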
This paper presents an approach for constrained Gaussian Process (GP) regression where we assume that a set of linear transformations of the process is bounded. It is motivated by machine learning applications for high-consequence engineering systems, where this kind of information is often available from phenomenological knowledge, and the resulting constraints may be essential to achieve the level of confidence needed. We consider a GP $f$ over functions on $\mathcal{X} \subset \mathbb{R}^{n}$ taking values in $\mathbb{R}$, where the process $\mathcal{L}f$ is still Gaussian when $\mathcal{L}$ is a linear operator. Our goal is to model $f$ under the constraint that realizations of $\mathcal{L}f$ are confined to a convex set of functions. In particular, we require that $a \leq \mathcal{L}f \leq b$ given two functions $a$ and $b$ where $a < b$ pointwise. This formulation provides a consistent way of encoding multiple linear constraints, such as shape constraints based on boundedness, monotonicity, or convexity. We adopt the approach of using a sufficiently dense set of virtual observation locations where the constraint is required to hold, and derive the exact posterior for a conjugate likelihood. The results needed for stable numerical implementation are derived, together with an efficient sampling scheme for estimating the posterior process, which is exact in the limit. A few numerical examples focusing on noiseless observations are given. This setting is relevant for computer code emulation and is also more computationally demanding than the alternative scenario with i.i.d. Gaussian noise.
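As a rough illustration of the virtual-observation idea, the following Python sketch conditions a GP on noiseless observations and then keeps only those posterior draws that satisfy $a \leq f \leq b$ at a grid of virtual locations. It is not the efficient sampling scheme derived in the paper: it uses naive rejection sampling, takes $\mathcal{L}$ to be the identity, and assumes an RBF kernel with fixed hyperparameters. The data, bounds, and function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x1, x2, variance=1.0, lengthscale=0.3):
    """Squared-exponential kernel on 1-D inputs (an assumed choice)."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_at(x_virtual, x_obs, y_obs, jitter=1e-8):
    """Zero-mean GP posterior at x_virtual given (nearly) noiseless data."""
    K = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    Ks = rbf_kernel(x_virtual, x_obs)
    Kss = rbf_kernel(x_virtual, x_virtual)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    V = np.linalg.solve(L, Ks.T)
    return Ks @ alpha, Kss - V.T @ V

def sample_bounded(mean, cov, lower, upper, n_keep, rng,
                   batch=2000, max_batches=200):
    """Rejection sampling: keep draws with lower <= f <= upper at every virtual point."""
    # Eigendecomposition tolerates a near-singular posterior covariance.
    w, U = np.linalg.eigh(cov)
    A = U * np.sqrt(np.clip(w, 0.0, None))  # cov is approximately A @ A.T
    kept, n_kept = [], 0
    for _ in range(max_batches):
        z = rng.standard_normal((batch, len(mean)))
        draws = mean + z @ A.T
        ok = np.all((draws >= lower) & (draws <= upper), axis=1)
        kept.append(draws[ok])
        n_kept += int(ok.sum())
        if n_kept >= n_keep:
            return np.concatenate(kept)[:n_keep]
    raise RuntimeError("too few accepted samples; constraint may be too tight")

rng = np.random.default_rng(1)

# Hypothetical noiseless observations of a function known to lie in [0, 1].
x_obs = np.array([0.1, 0.5, 0.9])
y_obs = np.array([0.2, 0.6, 0.4])

# Virtual observation locations where the bound a <= f <= b is enforced.
x_virtual = np.linspace(0.0, 1.0, 21)
a, b = 0.0, 1.0

mean, cov = posterior_at(x_virtual, x_obs, y_obs)
samples = sample_bounded(mean, cov, a, b, n_keep=200, rng=rng)
constrained_mean = samples.mean(axis=0)  # Monte Carlo estimate of the constrained posterior mean
```

Rejection sampling is only workable here because the example is low-dimensional and the bounds are loose; as the virtual grid becomes denser or the constraint tighter, the acceptance rate drops quickly, which is one reason a dedicated sampling scheme is needed in practice.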