Abstract:The widespread success of foundation models in natural language processing and computer vision has inspired researchers to extend the concept to scientific machine learning and computational science. However, this position paper argues that as the term "foundation model" is an evolving concept, its application in computational science is increasingly used without a universally accepted definition, potentially creating confusion and diluting its precise scientific meaning. In this paper, we address this gap by proposing a formal definition of foundation models in computational science, grounded in the core values of generality, reusability, and scalability. We articulate a set of essential and desirable characteristics that such models must exhibit, drawing parallels with traditional foundational methods, like the finite element and finite volume methods. Furthermore, we introduce the Data-Driven Finite Element Method (DD-FEM), a framework that fuses the modular structure of classical FEM with the representational power of data-driven learning. We demonstrate how DD-FEM addresses many of the key challenges in realizing foundation models for computational science, including scalability, adaptability, and physics consistency. By bridging traditional numerical methods with modern AI paradigms, this work provides a rigorous foundation for evaluating and developing novel approaches toward future foundation models in computational science.
Abstract:Data-driven closures correct the standard reduced order models (ROMs) to increase their accuracy in under-resolved, convection-dominated flows. There are two types of data-driven ROM closures in current use: (i) structural, with simple ansatzes (e.g., linear or quadratic); and (ii) machine learning-based, with neural network ansatzes. We propose a novel symbolic regression (SR) data-driven ROM closure strategy, which combines the advantages of current approaches and eliminates their drawbacks. As a result, the new data-driven SR closures yield ROMs that are interpretable, parsimonious, accurate, generalizable, and robust. To compare the data-driven SR-ROM closures with the structural and machine learning-based ROM closures, we consider the data-driven variational multiscale ROM framework and two under-resolved, convection-dominated test problems: the flow past a cylinder and the lid-driven cavity flow at Reynolds numbers Re = 10000, 15000, and 20000. This numerical investigation shows that the new data-driven SR-ROM closures yield more accurate and robust ROMs than the structural and machine learning ROM closures.