Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pu Ren

Learning, Solving and Optimizing PDEs with TensorGalerkin: an efficient high-performance Galerkin assembly algorithm

Feb 04, 2026

Shizheng Wen, Mingyuan Chi, Tianwei Yu, Ben Moseley, Mike Yan Michelis, Pu Ren, Hao Sun, Siddhartha Mishra

Abstract:We present a unified algorithmic framework for the numerical solution, constrained optimization, and physics-informed learning of PDEs with a variational structure. Our framework is based on a Galerkin discretization of the underlying variational forms, and its high efficiency stems from a novel highly-optimized and GPU-compliant TensorGalerkin framework for linear system assembly (stiffness matrices and load vectors). TensorGalerkin operates by tensorizing element-wise operations within a Python-level Map stage and then performs global reduction with a sparse matrix multiplication that performs message passing on the mesh-induced sparsity graph. It can be seamlessly employed downstream as i) a highly-efficient numerical PDEs solver, ii) an end-to-end differentiable framework for PDE-constrained optimization, and iii) a physics-informed operator learning algorithm for PDEs. With multiple benchmarks, including 2D and 3D elliptic, parabolic, and hyperbolic PDEs on unstructured meshes, we demonstrate that the proposed framework provides significant computational efficiency and accuracy gains over a variety of baselines in all the targeted downstream applications.

Via

Access Paper or Ask Questions

Modeling Non-Ergodic Path Effects Using Conditional Generative Model for Fourier Amplitude Spectra

Dec 22, 2025

Maxime Lacour, Pu Ren, Rie Nakata, Nori Nakata, Michael Mahoney

Abstract:Recent developments in non-ergodic ground-motion models (GMMs) explicitly model systematic spatial variations in source, site, and path effects, reducing standard deviation to 30-40% of ergodic models and enabling more accurate site-specific seismic hazard analysis. Current non-ergodic GMMs rely on Gaussian Process (GP) methods with prescribed correlation functions and thus have computational limitations for large-scale predictions. This study proposes a deep-learning approach called Conditional Generative Modeling for Fourier Amplitude Spectra (CGM-FAS) as an alternative to GP-based methods for modeling non-ergodic path effects in Fourier Amplitude Spectra (FAS). CGM-FAS uses a Conditional Variational Autoencoder architecture to learn spatial patterns and interfrequency correlation directly from data by using geographical coordinates of earthquakes and stations as conditional variables. Using San Francisco Bay Area earthquake data, we compare CGM-FAS against a recent GP-based GMM for the region and demonstrate consistent predictions of non-ergodic path effects. Additionally, CGM-FAS offers advantages compared to GP-based approaches in learning spatial patterns without prescribed correlation functions, capturing interfrequency correlations, and enabling rapid predictions, generating maps for 10,000 sites across 1,000 frequencies within 10 seconds using a few GB of memory. CGM-FAS hyperparameters can be tuned to ensure generated path effects exhibit variability consistent with the GP-based empirical GMM. This work demonstrates a promising direction for efficient non-ergodic ground-motion prediction across multiple frequencies and large spatial domains.

Via

Access Paper or Ask Questions

Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction

Dec 30, 2024

Yuan Mi, Pu Ren, Hongteng Xu, Hongsheng Liu, Zidong Wang, Yike Guo, Ji-Rong Wen, Hao Sun, Yang Liu

Figure 1 for Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction

Figure 2 for Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction

Figure 3 for Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction

Figure 4 for Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction

Abstract:Data-centric methods have shown great potential in understanding and predicting spatiotemporal dynamics, enabling better design and control of the object system. However, pure deep learning models often lack interpretability, fail to obey intrinsic physics, and struggle to cope with the various domains. While geometry-based methods, e.g., graph neural networks (GNNs), have been proposed to further tackle these challenges, they still need to find the implicit physical laws from large datasets and rely excessively on rich labeled data. In this paper, we herein introduce the conservation-informed GNN (CiGNN), an end-to-end explainable learning framework, to learn spatiotemporal dynamics based on limited training data. The network is designed to conform to the general conservation law via symmetry, where conservative and non-conservative information passes over a multiscale space enhanced by a latent temporal marching strategy. The efficacy of our model has been verified in various spatiotemporal systems based on synthetic and real-world datasets, showing superiority over baseline models. Results demonstrate that CiGNN exhibits remarkable accuracy and generalization ability, and is readily applicable to learning for prediction of various spatiotemporal dynamics in a spatial domain with complex geometry.

Via

Access Paper or Ask Questions

P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics

Oct 29, 2024

Qi Wang, Pu Ren, Hao Zhou, Xin-Yang Liu, Zhiwen Deng, Yi Zhang, Ruizhi Chengze, Hongsheng Liu, Zidong Wang, Jian-Xun Wang(+3 more)

Figure 1 for P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics

Figure 2 for P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics

Figure 3 for P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics

Figure 4 for P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics

Abstract:When solving partial differential equations (PDEs), classical numerical methods often require fine mesh grids and small time stepping to meet stability, consistency, and convergence conditions, leading to high computational cost. Recently, machine learning has been increasingly utilized to solve PDE problems, but they often encounter challenges related to interpretability, generalizability, and strong dependency on rich labeled data. Hence, we introduce a new PDE-Preserved Coarse Correction Network (P$^2$C$^2$Net) to efficiently solve spatiotemporal PDE problems on coarse mesh grids in small data regimes. The model consists of two synergistic modules: (1) a trainable PDE block that learns to update the coarse solution (i.e., the system state), based on a high-order numerical scheme with boundary condition encoding, and (2) a neural network block that consistently corrects the solution on the fly. In particular, we propose a learnable symmetric Conv filter, with weights shared over the entire model, to accurately estimate the spatial derivatives of PDE based on the neural-corrected system state. The resulting physics-encoded model is capable of handling limited training data (e.g., 3--5 trajectories) and accelerates the prediction of PDE solutions on coarse spatiotemporal grids while maintaining a high accuracy. P$^2$C$^2$Net achieves consistent state-of-the-art performance with over 50\% gain (e.g., in terms of relative prediction error) across four datasets covering complex reaction-diffusion processes and turbulent flows.

Via

Access Paper or Ask Questions

Model Balancing Helps Low-data Training and Fine-tuning

Oct 16, 2024

Zihang Liu, Yuanzhe Hu, Tianyu Pang, Yefan Zhou, Pu Ren, Yaoqing Yang

Figure 1 for Model Balancing Helps Low-data Training and Fine-tuning

Figure 2 for Model Balancing Helps Low-data Training and Fine-tuning

Figure 3 for Model Balancing Helps Low-data Training and Fine-tuning

Figure 4 for Model Balancing Helps Low-data Training and Fine-tuning

Abstract:Recent advances in foundation models have emphasized the need to align pre-trained models with specialized domains using small, curated datasets. Studies on these foundation models underscore the importance of low-data training and fine-tuning. This topic, well-known in natural language processing (NLP), has also gained increasing attention in the emerging field of scientific machine learning (SciML). To address the limitations of low-data training and fine-tuning, we draw inspiration from Heavy-Tailed Self-Regularization (HT-SR) theory, analyzing the shape of empirical spectral densities (ESDs) and revealing an imbalance in training quality across different model layers. To mitigate this issue, we adapt a recently proposed layer-wise learning rate scheduler, TempBalance, which effectively balances training quality across layers and enhances low-data training and fine-tuning for both NLP and SciML tasks. Notably, TempBalance demonstrates increasing performance gains as the amount of available tuning data decreases. Comparative analyses further highlight the effectiveness of TempBalance and its adaptability as an "add-on" method for improving model performance.

* EMNLP 2024 Oral. First two authors contributed equally

Via

Access Paper or Ask Questions

Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling

Jul 21, 2024

Pu Ren, Rie Nakata, Maxime Lacour, Ilan Naiman, Nori Nakata, Jialin Song, Zhengfa Bi, Osman Asif Malik, Dmitriy Morozov, Omri Azencot(+2 more)

Figure 1 for Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling

Figure 2 for Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling

Figure 3 for Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling

Figure 4 for Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling

Abstract:Predicting high-fidelity ground motions for future earthquakes is crucial for seismic hazard assessment and infrastructure resilience. Conventional empirical simulations suffer from sparse sensor distribution and geographically localized earthquake locations, while physics-based methods are computationally intensive and require accurate representations of Earth structures and earthquake sources. We propose a novel artificial intelligence (AI) simulator, Conditional Generative Modeling for Ground Motion (CGM-GM), to synthesize high-frequency and spatially continuous earthquake ground motion waveforms. CGM-GM leverages earthquake magnitudes and geographic coordinates of earthquakes and sensors as inputs, learning complex wave physics and Earth heterogeneities, without explicit physics constraints. This is achieved through a probabilistic autoencoder that captures latent distributions in the time-frequency domain and variational sequential models for prior and posterior distributions. We evaluate the performance of CGM-GM using small-magnitude earthquake records from the San Francisco Bay Area, a region with high seismic risks. CGM-GM demonstrates a strong potential for outperforming a state-of-the-art non-ergodic empirical ground motion model and shows great promise in seismology and beyond.

Via

Access Paper or Ask Questions

WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning

May 30, 2024

Dongwei Lyu, Rie Nakata, Pu Ren, Michael W. Mahoney, Arben Pitarka, Nori Nakata, N. Benjamin Erichson

Figure 1 for WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning

Figure 2 for WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning

Figure 3 for WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning

Figure 4 for WaveCastNet: An AI-enabled Wavefield Forecasting Framework for Earthquake Early Warning

Abstract:Large earthquakes can be destructive and quickly wreak havoc on a landscape. To mitigate immediate threats, early warning systems have been developed to alert residents, emergency responders, and critical infrastructure operators seconds to a minute before seismic waves arrive. These warnings provide time to take precautions and prevent damage. The success of these systems relies on fast, accurate predictions of ground motion intensities, which is challenging due to the complex physics of earthquakes, wave propagation, and their intricate spatial and temporal interactions. To improve early warning, we propose a novel AI-enabled framework, WaveCastNet, for forecasting ground motions from large earthquakes. WaveCastNet integrates a novel convolutional Long Expressive Memory (ConvLEM) model into a sequence to sequence (seq2seq) forecasting framework to model long-term dependencies and multi-scale patterns in both space and time. WaveCastNet, which shares weights across spatial and temporal dimensions, requires fewer parameters compared to more resource-intensive models like transformers and thus, in turn, reduces inference times. Importantly, WaveCastNet also generalizes better than transformer-based models to different seismic scenarios, including to more rare and critical situations with higher magnitude earthquakes. Our results using simulated data from the San Francisco Bay Area demonstrate the capability to rapidly predict the intensity and timing of destructive ground motions. Importantly, our proposed approach does not require estimating earthquake magnitudes and epicenters, which are prone to errors using conventional approaches; nor does it require empirical ground motion models, which fail to capture strongly heterogeneous wave propagation effects.

Via

Access Paper or Ask Questions

Reasoning-Enhanced Object-Centric Learning for Videos

Mar 22, 2024

Jian Li, Pu Ren, Yang Liu, Hao Sun

Figure 1 for Reasoning-Enhanced Object-Centric Learning for Videos

Figure 2 for Reasoning-Enhanced Object-Centric Learning for Videos

Figure 3 for Reasoning-Enhanced Object-Centric Learning for Videos

Figure 4 for Reasoning-Enhanced Object-Centric Learning for Videos

Abstract:Object-centric learning aims to break down complex visual scenes into more manageable object representations, enhancing the understanding and reasoning abilities of machine learning systems toward the physical world. Recently, slot-based video models have demonstrated remarkable proficiency in segmenting and tracking objects, but they overlook the importance of the effective reasoning module. In the real world, reasoning and predictive abilities play a crucial role in human perception and object tracking; in particular, these abilities are closely related to human intuitive physics. Inspired by this, we designed a novel reasoning module called the Slot-based Time-Space Transformer with Memory buffer (STATM) to enhance the model's perception ability in complex scenes. The memory buffer primarily serves as storage for slot information from upstream modules, the Slot-based Time-Space Transformer makes predictions through slot-based spatiotemporal attention computations and fusion. Our experiment results on various datasets show that STATM can significantly enhance object-centric learning capabilities of slot-based video models.

Via

Access Paper or Ask Questions

Physics-Informed Machine Learning for Seismic Response Prediction OF Nonlinear Steel Moment Resisting Frame Structures

Mar 01, 2024

R. Bailey Bond, Pu Ren, Jerome F. Hajjar, Hao Sun

Figure 1 for Physics-Informed Machine Learning for Seismic Response Prediction OF Nonlinear Steel Moment Resisting Frame Structures

Figure 2 for Physics-Informed Machine Learning for Seismic Response Prediction OF Nonlinear Steel Moment Resisting Frame Structures

Figure 3 for Physics-Informed Machine Learning for Seismic Response Prediction OF Nonlinear Steel Moment Resisting Frame Structures

Figure 4 for Physics-Informed Machine Learning for Seismic Response Prediction OF Nonlinear Steel Moment Resisting Frame Structures

Abstract:There is a growing interest in utilizing machine learning (ML) methods for structural metamodeling due to the substantial computational cost of traditional numerical simulations. The existing data-driven strategies show potential limitations to the model robustness and interpretability as well as the dependency of rich data. To address these challenges, this paper presents a novel physics-informed machine learning (PiML) method, which incorporates scientific principles and physical laws into deep neural networks for modeling seismic responses of nonlinear structures. The basic concept is to constrain the solution space of the ML model within known physical bounds. This is made possible with three main features, namely, model order reduction, a long short-term memory (LSTM) networks, and Newton's second law (e.g., the equation of motion). Model order reduction is essential for handling structural systems with inherent redundancy and enhancing model efficiency. The LSTM network captures temporal dependencies, enabling accurate prediction of time series responses. The equation of motion is manipulated to learn system nonlinearities and confines the solution space within physically interpretable results. These features enable model training with relatively sparse data and offer benefits in terms of accuracy, interpretability, and robustness. Furthermore, a dataset of seismically designed archetype ductile planar steel moment resistant frames under horizontal seismic loading, available in the DesignSafe-CI Database, is considered for evaluation of the proposed method. The resulting metamodel is capable of handling more complex data compared to existing physics-guided LSTM models and outperforms other non-physics data-driven neural networks.

* 34 pages, 12 figures

Via

Access Paper or Ask Questions

Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Feb 24, 2024

Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, Michael W. Mahoney

Figure 1 for Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Figure 2 for Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Figure 3 for Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Figure 4 for Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning

Abstract:Recent years have witnessed the promise of coupling machine learning methods and physical domain-specific insight for solving scientific problems based on partial differential equations (PDEs). However, being data-intensive, these methods still require a large amount of PDE data. This reintroduces the need for expensive numerical PDE solutions, partially undermining the original goal of avoiding these expensive simulations. In this work, seeking data efficiency, we design unsupervised pretraining and in-context learning methods for PDE operator learning. To reduce the need for training data with simulated solutions, we pretrain neural operators on unlabeled PDE data using reconstruction-based proxy tasks. To improve out-of-distribution performance, we further assist neural operators in flexibly leveraging in-context learning methods, without incurring extra training costs or designs. Extensive empirical evaluations on a diverse set of PDEs demonstrate that our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models.

Via

Access Paper or Ask Questions