Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Saif Eddin Jabari

Efficient Dilated Squeeze and Excitation Neural Operator for Differential Equations

Jan 24, 2026

Prajwal Chauhan, Salah Eddine Choutri, Saif Eddin Jabari

Abstract:Fast and accurate surrogates for physics-driven partial differential equations (PDEs) are essential in fields such as aerodynamics, porous media design, and flow control. However, many transformer-based models and existing neural operators remain parameter-heavy, resulting in costly training and sluggish deployment. We propose D-SENO (Dilated Squeeze-Excitation Neural Operator), a lightweight operator learning framework for efficiently solving a wide range of PDEs, including airfoil potential flow, Darcy flow in porous media, pipe Poiseuille flow, and incompressible Navier Stokes vortical fields. D-SENO combines dilated convolution (DC) blocks with squeeze-and-excitation (SE) modules to jointly capture wide receptive fields and dynamics alongside channel-wise attention, enabling both accurate and efficient PDE inference. Carefully chosen dilation rates allow the receptive field to focus on critical regions, effectively modeling long-range physical dependencies. Meanwhile, the SE modules adaptively recalibrate feature channels to emphasize dynamically relevant scales. Our model achieves training speed of up to approximately $20\times$ faster than standard transformer-based models and neural operators, while also surpassing (or matching) them in accuracy across multiple PDE benchmarks. Ablation studies show that removing the SE modules leads to a slight drop in performance.

* Accepted to Transactions on Machine Learning Research (TMLR)

Via

Access Paper or Ask Questions

FlowMixer: A Constrained Neural Architecture for Interpretable Spatiotemporal Forecasting

May 22, 2025

Fares B. Mehouachi, Saif Eddin Jabari

Abstract:We introduce FlowMixer, a neural architecture that leverages constrained matrix operations to model structured spatiotemporal patterns. At its core, FlowMixer incorporates non-negative matrix mixing layers within a reversible mapping framework-applying transforms before mixing and their inverses afterward. This shape-preserving design enables a Kronecker-Koopman eigenmode framework that bridges statistical learning with dynamical systems theory, providing interpretable spatiotemporal patterns and facilitating direct algebraic manipulation of prediction horizons without retraining. Extensive experiments across diverse domains demonstrate FlowMixer's robust long-horizon forecasting capabilities while effectively modeling physical phenomena such as chaotic attractors and turbulent flows. These results suggest that architectural constraints can simultaneously enhance predictive performance and mathematical interpretability in neural forecasting systems.

Via

Access Paper or Ask Questions

Koopman Autoencoders Learn Neural Representation Dynamics

May 19, 2025

Nishant Suresh Aswani, Saif Eddin Jabari

Abstract:This paper explores a simple question: can we model the internal transformations of a neural network using dynamical systems theory? We introduce Koopman autoencoders to capture how neural representations evolve through network layers, treating these representations as states in a dynamical system. Our approach learns a surrogate model that predicts how neural representations transform from input to output, with two key advantages. First, by way of lifting the original states via an autoencoder, it operates in a linear space, making editing the dynamics straightforward. Second, it preserves the topologies of the original representations by regularizing the autoencoding objective. We demonstrate that these surrogate models naturally replicate the progressive topological simplification observed in neural networks. As a practical application, we show how our approach enables targeted class unlearning in the Yin-Yang and MNIST classification tasks.

Via

Access Paper or Ask Questions

Catastrophic Overfitting, Entropy Gap and Participation Ratio: A Noiseless $l^p$ Norm Solution for Fast Adversarial Training

May 05, 2025

Fares B. Mehouachi, Saif Eddin Jabari

Abstract:Adversarial training is a cornerstone of robust deep learning, but fast methods like the Fast Gradient Sign Method (FGSM) often suffer from Catastrophic Overfitting (CO), where models become robust to single-step attacks but fail against multi-step variants. While existing solutions rely on noise injection, regularization, or gradient clipping, we propose a novel solution that purely controls the $l^p$ training norm to mitigate CO. Our study is motivated by the empirical observation that CO is more prevalent under the $l^{\infty}$ norm than the $l^2$ norm. Leveraging this insight, we develop a framework for generalized $l^p$ attack as a fixed point problem and craft $l^p$-FGSM attacks to understand the transition mechanics from $l^2$ to $l^{\infty}$. This leads to our core insight: CO emerges when highly concentrated gradients where information localizes in few dimensions interact with aggressive norm constraints. By quantifying gradient concentration through Participation Ratio and entropy measures, we develop an adaptive $l^p$-FGSM that automatically tunes the training norm based on gradient information. Extensive experiments demonstrate that this approach achieves strong robustness without requiring additional regularization or noise injection, providing a novel and theoretically-principled pathway to mitigate the CO problem.

Via

Access Paper or Ask Questions

Neural operators struggle to learn complex PDEs in pedestrian mobility: Hughes model case study

Apr 25, 2025

Prajwal Chauhan, Salah Eddine Choutri, Mohamed Ghattassi, Nader Masmoudi, Saif Eddin Jabari

Abstract:This paper investigates the limitations of neural operators in learning solutions for a Hughes model, a first-order hyperbolic conservation law system for crowd dynamics. The model couples a Fokker-Planck equation representing pedestrian density with a Hamilton-Jacobi-type (eikonal) equation. This Hughes model belongs to the class of nonlinear hyperbolic systems that often exhibit complex solution structures, including shocks and discontinuities. In this study, we assess the performance of three state-of-the-art neural operators (Fourier Neural Operator, Wavelet Neural Operator, and Multiwavelet Neural Operator) in various challenging scenarios. Specifically, we consider (1) discontinuous and Gaussian initial conditions and (2) diverse boundary conditions, while also examining the impact of different numerical schemes. Our results show that these neural operators perform well in easy scenarios with fewer discontinuities in the initial condition, yet they struggle in complex scenarios with multiple initial discontinuities and dynamic boundary conditions, even when trained specifically on such complex samples. The predicted solutions often appear smoother, resulting in a reduction in total variation and a loss of important physical features. This smoothing behavior is similar to issues discussed by Daganzo (1995), where models that introduce artificial diffusion were shown to miss essential features such as shock waves in hyperbolic systems. These results suggest that current neural operator architectures may introduce unintended regularization effects that limit their ability to capture transport dynamics governed by discontinuities. They also raise concerns about generalizing these methods to traffic applications where shock preservation is essential.

* 26 pages, 15 figures, 6 tables, under review at Artificial Intelligence for Transportation | Journal

Via

Access Paper or Ask Questions

Urban traffic analysis and forecasting through shared Koopman eigenmodes

Sep 07, 2024

Chuhan Yang, Fares B. Mehouachi, Monica Menendez, Saif Eddin Jabari

Figure 1 for Urban traffic analysis and forecasting through shared Koopman eigenmodes

Figure 2 for Urban traffic analysis and forecasting through shared Koopman eigenmodes

Figure 3 for Urban traffic analysis and forecasting through shared Koopman eigenmodes

Figure 4 for Urban traffic analysis and forecasting through shared Koopman eigenmodes

Abstract:Predicting traffic flow in data-scarce cities is challenging due to limited historical data. To address this, we leverage transfer learning by identifying periodic patterns common to data-rich cities using a customized variant of Dynamic Mode Decomposition (DMD): constrained Hankelized DMD (TrHDMD). This method uncovers common eigenmodes (urban heartbeats) in traffic patterns and transfers them to data-scarce cities, significantly enhancing prediction performance. TrHDMD reduces the need for extensive training datasets by utilizing prior knowledge from other cities. By applying Koopman operator theory to multi-city loop detector data, we identify stable, interpretable, and time-invariant traffic modes. Injecting ``urban heartbeats'' into forecasting tasks improves prediction accuracy and has the potential to enhance traffic management strategies for cities with varying data infrastructures. Our work introduces cross-city knowledge transfer via shared Koopman eigenmodes, offering actionable insights and reliable forecasts for data-scarce urban environments.

Via

Access Paper or Ask Questions

Representing Neural Network Layers as Linear Operations via Koopman Operator Theory

Sep 02, 2024

Nishant Suresh Aswani, Saif Eddin Jabari, Muhammad Shafique

Abstract:The strong performance of simple neural networks is often attributed to their nonlinear activations. However, a linear view of neural networks makes understanding and controlling networks much more approachable. We draw from a dynamical systems view of neural networks, offering a fresh perspective by using Koopman operator theory and its connections with dynamic mode decomposition (DMD). Together, they offer a framework for linearizing dynamical systems by embedding the system into an appropriate observable space. By reframing a neural network as a dynamical system, we demonstrate that we can replace the nonlinear layer in a pretrained multi-layer perceptron (MLP) with a finite-dimensional linear operator. In addition, we analyze the eigenvalues of DMD and the right singular vectors of SVD, to present evidence that time-delayed coordinates provide a straightforward and highly effective observable space for Koopman theory to linearize a network layer. Consequently, we replace layers of an MLP trained on the Yin-Yang dataset with predictions from a DMD model, achieving a mdoel accuracy of up to 97.3%, compared to the original 98.4%. In addition, we replace layers in an MLP trained on the MNIST dataset, achieving up to 95.8%, compared to the original 97.2% on the test set.

Via

Access Paper or Ask Questions

Fourier neural operator for learning solutions to macroscopic traffic flow models: Application to the forward and inverse problems

Aug 14, 2023

Bilal Thonnam Thodi, Sai Venkata Ramana Ambadipudi, Saif Eddin Jabari

Abstract:Deep learning methods are emerging as popular computational tools for solving forward and inverse problems in traffic flow. In this paper, we study a neural operator framework for learning solutions to nonlinear hyperbolic partial differential equations with applications in macroscopic traffic flow models. In this framework, an operator is trained to map heterogeneous and sparse traffic input data to the complete macroscopic traffic state in a supervised learning setting. We chose a physics-informed Fourier neural operator ($\pi$-FNO) as the operator, where an additional physics loss based on a discrete conservation law regularizes the problem during training to improve the shock predictions. We also propose to use training data generated from random piecewise constant input data to systematically capture the shock and rarefied solutions. From experiments using the LWR traffic flow model, we found superior accuracy in predicting the density dynamics of a ring-road network and urban signalized road. We also found that the operator can be trained using simple traffic density dynamics, e.g., consisting of $2-3$ vehicle queues and $1-2$ traffic signal cycles, and it can predict density dynamics for heterogeneous vehicle queue distributions and multiple traffic signal cycles $(\geq 2)$ with an acceptable error. The extrapolation error grew sub-linearly with input complexity for a proper choice of the model architecture and training data. Adding a physics regularizer aided in learning long-term traffic density dynamics, especially for problems with periodic boundary data.

Via

Access Paper or Ask Questions

Physical Backdoor Trigger Activation of Autonomous Vehicle using Reachability Analysis

Mar 27, 2023

Wenqing Li, Yue Wang, Muhammad Shafique, Saif Eddin Jabari

Figure 1 for Physical Backdoor Trigger Activation of Autonomous Vehicle using Reachability Analysis

Figure 2 for Physical Backdoor Trigger Activation of Autonomous Vehicle using Reachability Analysis

Figure 3 for Physical Backdoor Trigger Activation of Autonomous Vehicle using Reachability Analysis

Figure 4 for Physical Backdoor Trigger Activation of Autonomous Vehicle using Reachability Analysis

Abstract:Recent studies reveal that Autonomous Vehicles (AVs) can be manipulated by hidden backdoors, causing them to perform harmful actions when activated by physical triggers. However, it is still unclear how these triggers can be activated while adhering to traffic principles. Understanding this vulnerability in a dynamic traffic environment is crucial. This work addresses this gap by presenting physical trigger activation as a reachability problem of controlled dynamic system. Our technique identifies security-critical areas in traffic systems where trigger conditions for accidents can be reached, and provides intended trajectories for how those conditions can be reached. Testing on typical traffic scenarios showed the system can be successfully driven to trigger conditions with near 100% activation rate. Our method benefits from identifying AV vulnerability and enabling effective safety strategies.

Via

Access Paper or Ask Questions

Optimal Smoothing Distribution Exploration for Backdoor Neutralization in Deep Learning-based Traffic Systems

Mar 24, 2023

Yue Wang, Wending Li, Michail Maniatakos, Saif Eddin Jabari

Figure 1 for Optimal Smoothing Distribution Exploration for Backdoor Neutralization in Deep Learning-based Traffic Systems

Figure 2 for Optimal Smoothing Distribution Exploration for Backdoor Neutralization in Deep Learning-based Traffic Systems

Figure 3 for Optimal Smoothing Distribution Exploration for Backdoor Neutralization in Deep Learning-based Traffic Systems

Figure 4 for Optimal Smoothing Distribution Exploration for Backdoor Neutralization in Deep Learning-based Traffic Systems

Abstract:Deep Reinforcement Learning (DRL) enhances the efficiency of Autonomous Vehicles (AV), but also makes them susceptible to backdoor attacks that can result in traffic congestion or collisions. Backdoor functionality is typically incorporated by contaminating training datasets with covert malicious data to maintain high precision on genuine inputs while inducing the desired (malicious) outputs for specific inputs chosen by adversaries. Current defenses against backdoors mainly focus on image classification using image-based features, which cannot be readily transferred to the regression task of DRL-based AV controllers since the inputs are continuous sensor data, i.e., the combinations of velocity and distance of AV and its surrounding vehicles. Our proposed method adds well-designed noise to the input to neutralize backdoors. The approach involves learning an optimal smoothing (noise) distribution to preserve the normal functionality of genuine inputs while neutralizing backdoors. By doing so, the resulting model is expected to be more resilient against backdoor attacks while maintaining high accuracy on genuine inputs. The effectiveness of the proposed method is verified on a simulated traffic system based on a microscopic traffic simulator, where experimental results showcase that the smoothed traffic controller can neutralize all trigger samples and maintain the performance of relieving traffic congestion

Via

Access Paper or Ask Questions