Stijn Vansummeren

Halting Recurrent GNNs and the Graded $μ$-Calculus

May 16, 2025

Jeroen Bollen, Jan Van den Bussche, Stijn Vansummeren, Jonni Virtema

Abstract: Graph Neural Networks (GNNs) are a class of machine-learning models that operate on graph-structured data. Their expressive power is intimately related to logics that are invariant under graded bisimilarity. Current proposals for recurrent GNNs either assume that the graph size is given to the model, or suffer from a lack of termination guarantees. In this paper, we propose a halting mechanism for recurrent GNNs. We prove that our halting model can express all node classifiers definable in graded modal mu-calculus, even for the standard GNN variant that is oblivious to the graph size. A recent breakthrough in the study of the expressivity of graded modal mu-calculus in the finite suggests that conversely, restricted to node classifiers definable in monadic second-order logic, recurrent GNNs can express only node classifiers definable in graded modal mu-calculus. To prove our main result, we develop a new approximate semantics for graded mu-calculus, which we believe to be of independent interest. We leverage this new semantics into a new model-checking algorithm, called the counting algorithm, which is oblivious to the graph size. In a final step we show that the counting algorithm can be implemented on a halting recurrent GNN.

SpannerLib: Embedding Declarative Information Extraction in an Imperative Workflow

Sep 03, 2024

Dean Light, Ahmad Aiashy, Mahmoud Diab, Daniel Nachmias, Stijn Vansummeren, Benny Kimelfeld

Abstract: Document spanners have been proposed as a formal framework for declarative Information Extraction (IE) from text, following IE products from the industry and academia. Over the past decade, the framework has been studied thoroughly in terms of expressive power, complexity, and the ability to naturally combine text analysis with relational querying. This demonstration presents SpannerLib, a library for embedding document spanners in Python code. SpannerLib facilitates the development of IE programs by providing an implementation of Spannerlog (Datalog-based document spanners) that interacts with the Python code in two directions: rules can be embedded inside Python, and they can invoke custom Python code (e.g., calls to ML-based NLP models) via user-defined functions. The demonstration scenarios showcase IE programs, with increasing levels of complexity, within Jupyter Notebook.

* 4 pages

Schema Matching with Large Language Models: an Experimental Study

Jul 16, 2024

Marcel Parciak, Brecht Vandevoort, Frank Neven, Liesbet M. Peeters, Stijn Vansummeren

Abstract: Large Language Models (LLMs) have shown useful applications in a variety of tasks, including data wrangling. In this paper, we investigate the use of an off-the-shelf LLM for schema matching. Our objective is to identify semantic correspondences between elements of two relational schemas using only names and descriptions. Using a newly created benchmark from the health domain, we propose different so-called task scopes. These are methods for prompting the LLM to do schema matching, which vary in the amount of context information contained in the prompt. Using these task scopes we compare LLM-based schema matching against a string similarity baseline, investigating matching quality, verification effort, decisiveness, and complementarity of the approaches. We find that matching quality suffers from a lack of context information, but also from providing too much context information. In general, using newer LLM versions increases decisiveness. We identify task scopes that have acceptable verification effort and succeed in identifying a significant number of true semantic matches. Our study shows that LLMs have potential in bootstrapping the schema matching process and are able to assist data engineers in speeding up this task solely based on schema element names and descriptions without the need for data instances.

* Accepted at the 2nd International Workshop on Tabular Data Analysis (TaDA24), collocated with the 50th International Conference on Very Large Data Bases (VLDB 2024), Guangzhou, China, August 29, 2024

Learning Graph Neural Networks using Exact Compression

Apr 28, 2023

Jeroen Bollen, Jasper Steegmans, Jan Van den Bussche, Stijn Vansummeren

Abstract: Graph Neural Networks (GNNs) are a form of deep learning that enables a wide range of machine learning applications on graph-structured data. The learning of GNNs, however, is known to pose challenges for memory-constrained devices such as GPUs. In this paper, we study exact compression as a way to reduce the memory requirements of learning GNNs on large graphs. In particular, we adopt a formal approach to compression and propose a methodology that transforms GNN learning problems into provably equivalent compressed GNN learning problems. In a preliminary experimental evaluation, we give insights into the compression ratios that can be obtained on real-world graphs and apply our methodology to an existing GNN benchmark.

* Extended version of the paper to be published in the proceedings of the 6th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), held in conjunction with SIGMOD/PODS 2023
