Abstract:Machine learning methods often struggle with real-world applications in science and engineering due to limited or low-quality training data. In this work, the example of groundwater flow with heat transport is considered; this corresponds to an advection-diffusion process under heterogeneous flow conditions, that is, spatially distributed material parameters and heat sources. Classical numerical simulations are costly and challenging due to high spatio-temporal resolution requirements and large domains. While often computationally more efficient, purely data-driven surrogate models face difficulties, particularly in predicting the advection process, which is highly sensitive to input variations and involves long-range spatial interactions. Therefore, in this work, a Local-Global Convolutional Neural Network (LGCNN) approach is introduced. It combines a lightweight numerical surrogate for the transport process (global) with convolutional neural networks for the groundwater velocity and heat diffusion processes (local). With the LGCNN, a city-wide subsurface temperature field is modeled, involving a heterogeneous groundwater flow field and one hundred groundwater heat pump injection points forming interacting heat plumes over long distances. The model is first systematically analyzed based on random subsurface input fields. Then, the model is trained on a handful of cut-outs from a real-world subsurface map of the Munich region in Germany, and it scales to larger cut-outs without retraining. All datasets, our code, and trained models are published for reproducibility.