Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Emanuele Zappala

Nonlocal operator learning for fMRI encoding and decoding tasks

May 19, 2026

Andreas Kramer, Saugat Acharya, Alice Giola, Emanuele Zappala

Abstract:Functional MRI data exhibit high-dimensional spatiotemporal structure, making both prediction and decoding challenging. In this work, we investigate neural integral-operator-based models for encoding and decoding tasks in fMRI, with particular emphasis on the role of nonlocal spatiotemporal context. We implement a latent neural integral operator framework that performs fixed point iterations in an auxiliary space from which classification and stimuli prediction is performed via a decoder. We evaluate our model on two open-source fMRI datasets. Our experiments examine both decoding of stimuli from fMRI recordings and encoding of fMRI dynamics from stimulus representations. A main focus is the effect of spatiotemporal context: we systematically compare short and long temporal windows, as well as the use of visual cortex vs whole brain recordings, and analyze their influence on performance and latent-space geometry. Across tasks and datasets, larger temporal windows generally improve results and produce more structured learned representations. In decoding experiments, the learned latent space often provides clearer class separation than the raw data. In encoding experiments, although absolute performance remains moderate due to the difficulty of the task, longer temporal windows still yield consistent gains. These findings suggest that neural integral operators provide a promising framework for modeling fMRI dynamics and that broader spatiotemporal context can be beneficial for both prediction and representation learning. More broadly, the results indicate that exploiting distributed nonlocal structure in brain dynamics requires model architectures specifically designed to capture such dependencies.

* 18 pages, 4 figures, 5 tables. Comments are welcome!

Via

Access Paper or Ask Questions

Neural Integral Operators for Inverse problems in Spectroscopy

May 07, 2025

Emanuele Zappala, Alice Giola, Andreas Kramer, Enrico Greco

Abstract:Deep learning has shown high performance on spectroscopic inverse problems when sufficient data is available. However, it is often the case that data in spectroscopy is scarce, and this usually causes severe overfitting problems with deep learning methods. Traditional machine learning methods are viable when datasets are smaller, but the accuracy and applicability of these methods is generally more limited. We introduce a deep learning method for classification of molecular spectra based on learning integral operators via integral equations of the first kind, which results in an algorithm that is less affected by overfitting issues on small datasets, compared to other deep learning models. The problem formulation of the deep learning approach is based on inverse problems, which have traditionally found important applications in spectroscopy. We perform experiments on real world data to showcase our algorithm. It is seen that the model outperforms traditional machine learning approaches such as decision tree and support vector machine, and for small datasets it outperforms other deep learning models. Therefore, our methodology leverages the power of deep learning, still maintaining the performance when the available data is very limited, which is one of the main issues that deep learning faces in spectroscopy, where datasets are often times of small size.

* 13 pages. v2: Link to code repository added. Comments are welcome

Via

Access Paper or Ask Questions

CaLMFlow: Volterra Flow Matching using Causal Language Models

Oct 03, 2024

Sizhuang He, Daniel Levine, Ivan Vrkic, Marco Francesco Bressana, David Zhang, Syed Asad Rizvi, Yangtian Zhang, Emanuele Zappala, David van Dijk

Abstract:We introduce CaLMFlow (Causal Language Models for Flow Matching), a novel framework that casts flow matching as a Volterra integral equation (VIE), leveraging the power of large language models (LLMs) for continuous data generation. CaLMFlow enables the direct application of LLMs to learn complex flows by formulating flow matching as a sequence modeling task, bridging discrete language modeling and continuous generative modeling. Our method implements tokenization across space and time, thereby solving a VIE over these domains. This approach enables efficient handling of high-dimensional data and outperforms ODE solver-dependent methods like conditional flow matching (CFM). We demonstrate CaLMFlow's effectiveness on synthetic and real-world data, including single-cell perturbation response prediction, showcasing its ability to incorporate textual context and generalize to unseen conditions. Our results highlight LLM-driven flow matching as a promising paradigm in generative modeling, offering improved scalability, flexibility, and context-awareness.

* 10 pages, 9 figures, 7 tables

Via

Access Paper or Ask Questions

Intelligence at the Edge of Chaos

Oct 03, 2024

Shiyang Zhang, Aakash Patel, Syed A Rizvi, Nianchen Liu, Sizhuang He, Amin Karbasi, Emanuele Zappala, David van Dijk

Abstract:We explore the emergence of intelligent behavior in artificial systems by investigating how the complexity of rule-based systems influences the capabilities of models trained to predict these rules. Our study focuses on elementary cellular automata (ECA), simple yet powerful one-dimensional systems that generate behaviors ranging from trivial to highly complex. By training distinct Large Language Models (LLMs) on different ECAs, we evaluated the relationship between the complexity of the rules' behavior and the intelligence exhibited by the LLMs, as reflected in their performance on downstream tasks. Our findings reveal that rules with higher complexity lead to models exhibiting greater intelligence, as demonstrated by their performance on reasoning and chess move prediction tasks. Both uniform and periodic systems, and often also highly chaotic systems, resulted in poorer downstream performance, highlighting a sweet spot of complexity conducive to intelligence. We conjecture that intelligence arises from the ability to predict complexity and that creating intelligence may require only exposure to complexity.

* 15 pages,8 Figures

Via

Access Paper or Ask Questions

Leray-Schauder Mappings for Operator Learning

Oct 02, 2024

Emanuele Zappala

Abstract:We present an algorithm for learning operators between Banach spaces, based on the use of Leray-Schauder mappings to learn a finite-dimensional approximation of compact subspaces. We show that the resulting method is a universal approximator of (possibly nonlinear) operators. We demonstrate the efficiency of the approach on two benchmark datasets showing it achieves results comparable to state of the art models.

* 6 pages, 2 figures, 1 table. Comments are welcome!

Via

Access Paper or Ask Questions

Universal Approximation of Operators with Transformers and Neural Integral Operators

Sep 01, 2024

Emanuele Zappala, Maryam Bagherian

Abstract:We study the universal approximation properties of transformers and neural integral operators for operators in Banach spaces. In particular, we show that the transformer architecture is a universal approximator of integral operators between H\"older spaces. Moreover, we show that a generalized version of neural integral operators, based on the Gavurin integral, are universal approximators of arbitrary operators between Banach spaces. Lastly, we show that a modified version of transformer, which uses Leray-Schauder mappings, is a universal approximator of operators between arbitrary Banach spaces.

* 13 pages. Comments are welcome!

Via

Access Paper or Ask Questions

Projection Methods for Operator Learning and Universal Approximation

Jun 18, 2024

Emanuele Zappala

Abstract:We obtain a new universal approximation theorem for continuous operators on arbitrary Banach spaces using the Leray-Schauder mapping. Moreover, we introduce and study a method for operator learning in Banach spaces $L^p$ of functions with multiple variables, based on orthogonal projections on polynomial bases. We derive a universal approximation result for operators where we learn a linear projection and a finite dimensional mapping under some additional assumptions. For the case of $p=2$, we give some sufficient conditions for the approximation results to hold. This article serves as the theoretical framework for a deep learning methodology whose implementation will be provided in subsequent work.

* 10 pages

Via

Access Paper or Ask Questions

Spectral methods for Neural Integral Equations

Dec 09, 2023

Emanuele Zappala

Abstract:Neural integral equations are deep learning models based on the theory of integral equations, where the model consists of an integral operator and the corresponding equation (of the second kind) which is learned through an optimization procedure. This approach allows to leverage the nonlocal properties of integral operators in machine learning, but it is computationally expensive. In this article, we introduce a framework for neural integral equations based on spectral methods that allows us to learn an operator in the spectral domain, resulting in a cheaper computational cost, as well as in high interpolation accuracy. We study the properties of our methods and show various theoretical guarantees regarding the approximation capabilities of the model, and convergence to solutions of the numerical methods. We provide numerical experiments to demonstrate the practical effectiveness of the resulting model.

* 15 pages, 3 figures and 2 tables

Via

Access Paper or Ask Questions

Operator Learning Meets Numerical Analysis: Improving Neural Networks through Iterative Methods

Oct 02, 2023

Emanuele Zappala, Daniel Levine, Sizhuang He, Syed Rizvi, Sacha Levy, David van Dijk

Abstract:Deep neural networks, despite their success in numerous applications, often function without established theoretical foundations. In this paper, we bridge this gap by drawing parallels between deep learning and classical numerical analysis. By framing neural networks as operators with fixed points representing desired solutions, we develop a theoretical framework grounded in iterative methods for operator equations. Under defined conditions, we present convergence proofs based on fixed point theory. We demonstrate that popular architectures, such as diffusion models and AlphaFold, inherently employ iterative operator learning. Empirical assessments highlight that performing iterations through network operators improves performance. We also introduce an iterative graph neural network, PIGN, that further demonstrates benefits of iterations. Our work aims to enhance the understanding of deep learning by merging insights from numerical analysis, potentially guiding the design of future networks with clearer theoretical underpinnings and improved performance.

* 27 pages (13+14). 8 Figures and 5 tables. Comments are welcome!

Via

Access Paper or Ask Questions

Continuous Spatiotemporal Transformers

Jan 31, 2023

Antonio H. de O. Fonseca, Emanuele Zappala, Josue Ortega Caro, David van Dijk

Figure 1 for Continuous Spatiotemporal Transformers

Figure 2 for Continuous Spatiotemporal Transformers

Figure 3 for Continuous Spatiotemporal Transformers

Figure 4 for Continuous Spatiotemporal Transformers

Abstract:Modeling spatiotemporal dynamical systems is a fundamental challenge in machine learning. Transformer models have been very successful in NLP and computer vision where they provide interpretable representations of data. However, a limitation of transformers in modeling continuous dynamical systems is that they are fundamentally discrete time and space models and thus have no guarantees regarding continuous sampling. To address this challenge, we present the Continuous Spatiotemporal Transformer (CST), a new transformer architecture that is designed for the modeling of continuous systems. This new framework guarantees a continuous and smooth output via optimization in Sobolev space. We benchmark CST against traditional transformers as well as other spatiotemporal dynamics modeling methods and achieve superior performance in a number of tasks on synthetic and real systems, including learning brain dynamics from calcium imaging data.

Via

Access Paper or Ask Questions