Abstract:Strategic mine production scheduling under geological uncertainty is conventionally formulated as a stochastic optimization problem in which a fixed extraction sequence and routing decisions are computed ex ante. This plan-driven paradigm treats uncertainty as passive: decisions are hedged across geological scenarios, but planning does not anticipate how future observations will inform future decisions. We propose a different perspective by formulating mine scheduling as a Partially Observable Markov Decision Process (POMDP), in which extraction and routing decisions are made sequentially with planning explicitly integrating the expectation of future belief updates. To achieve computational tractability, we introduce a hybrid SA-POMDP architecture that combines simulated annealing-based (SA) value approximation with ensemble-based belief updating via ensemble smoother with multiple data assimilation (ES-MDA). At each decision epoch, candidate actions are evaluated through their expected long-term value under the current belief, and the belief is updated as mining observations are assimilated. This yields an adaptive policy rather than a fixed plan. We evaluate the framework on a copper-gold open-pit mining complex with multiple processing destinations. Under a statistically consistent prior, the SA-POMDP reduces the expectation-reality gap from 22.3% to 4.6%, improving realized NPV by USD8.4M relative to one-shot stochastic optimization. Under systematic prior misspecification of 10%, the adaptive framework outperforms static planning by up to USD44.6M (36.9%), demonstrating structural robustness beyond scenario hedging. These results show that sequential belief updating transforms geological uncertainty from a passive constraint into an active component of value creation.




Abstract:Optimal Bayesian decision making on what geoscientific data to acquire requires stating a prior model of uncertainty. Data acquisition is then optimized by reducing uncertainty on some property of interest maximally, and on average. In the context of exploration, very few, sometimes no data at all, is available prior to data acquisition planning. The prior model therefore needs to include human interpretations on the nature of spatial variability, or on analogue data deemed relevant for the area being explored. In mineral exploration, for example, humans may rely on conceptual models on the genesis of the mineralization to define multiple hypotheses, each representing a specific spatial variability of mineralization. More often than not, after the data is acquired, all of the stated hypotheses may be proven incorrect, i.e. falsified, hence prior hypotheses need to be revised, or additional hypotheses generated. Planning data acquisition under wrong geological priors is likely to be inefficient since the estimated uncertainty on the target property is incorrect, hence uncertainty may not be reduced at all. In this paper, we develop an intelligent agent based on partially observable Markov decision processes that plans optimally in the case of multiple geological or geoscientific hypotheses on the nature of spatial variability. Additionally, the artificial intelligence is equipped with a method that allows detecting, early on, whether the human stated hypotheses are incorrect, thereby saving considerable expense in data acquisition. Our approach is tested on a sediment-hosted copper deposit, and the algorithm presented has aided in the characterization of an ultra high-grade deposit in Zambia in 2023.




Abstract:Real-world planning problems$\unicode{x2014}$including autonomous driving and sustainable energy applications like carbon storage and resource exploration$\unicode{x2014}$have recently been modeled as partially observable Markov decision processes (POMDPs) and solved using approximate methods. To solve high-dimensional POMDPs in practice, state-of-the-art methods use online planning with problem-specific heuristics to reduce planning horizons and make the problems tractable. Algorithms that learn approximations to replace heuristics have recently found success in large-scale problems in the fully observable domain. The key insight is the combination of online Monte Carlo tree search with offline neural network approximations of the optimal policy and value function. In this work, we bring this insight to partially observed domains and propose BetaZero, a belief-state planning algorithm for POMDPs. BetaZero learns offline approximations based on accurate belief models to enable online decision making in long-horizon problems. We address several challenges inherent in large-scale partially observable domains; namely challenges of transitioning in stochastic environments, prioritizing action branching with limited search budget, and representing beliefs as input to the network. We apply BetaZero to various well-established benchmark POMDPs found in the literature. As a real-world case study, we test BetaZero on the high-dimensional geological problem of critical mineral exploration. Experiments show that BetaZero outperforms state-of-the-art POMDP solvers on a variety of tasks.