
Francesco Di Giovanni


How does over-squashing affect the power of GNNs?

Jun 06, 2023
Francesco Di Giovanni, T. Konstantin Rusch, Michael M. Bronstein, Andreea Deac, Marc Lackenby, Siddhartha Mishra, Petar Veličković


Graph Neural Networks (GNNs) are the state-of-the-art model for machine learning on graph-structured data. The most popular class of GNNs operates by exchanging information between adjacent nodes, and such models are known as Message Passing Neural Networks (MPNNs). Given their widespread use, understanding the expressive power of MPNNs is a key question. However, existing results typically consider settings with uninformative node features. In this paper, we provide a rigorous analysis to determine which function classes of node features can be learned by an MPNN of a given capacity. We do so by measuring the level of pairwise interactions between nodes that MPNNs allow for. This measure provides a novel quantitative characterization of the so-called over-squashing effect, which is observed to occur when a large volume of messages is aggregated into fixed-size vectors. Using our measure, we prove that, to guarantee sufficient communication between pairs of nodes, the capacity of the MPNN must be large enough, depending on properties of the input graph structure, such as commute times. For many relevant scenarios, our analysis yields impossibility statements, showing that in practice over-squashing hinders the expressive power of MPNNs. We validate our theoretical findings through extensive controlled experiments and ablation studies.
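
A convenient way to make "level of pairwise interactions" concrete is a Jacobian-based sensitivity measure. The sketch below uses assumed notation (h_v^(m) for the representation of node v after m layers, x_u for the input feature of node u) and illustrates the kind of quantity analysed in this line of work; it is not a verbatim statement from the paper.

```latex
% Sensitivity of node v's representation after m layers to node u's input feature
% (assumed notation, for illustration):
J^{(m)}_{vu} \;=\; \left\| \frac{\partial \mathbf{h}^{(m)}_v}{\partial \mathbf{x}_u} \right\|.
% Over-squashing: J^{(m)}_{vu} stays small for pairs (u, v) with large commute time
% unless the capacity (e.g. the width) of the MPNN grows accordingly.
```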


DRew: Dynamically Rewired Message Passing with Delay

May 18, 2023
Benjamin Gutteridge, Xiaowen Dong, Michael Bronstein, Francesco Di Giovanni


Message passing neural networks (MPNNs) have been shown to suffer from the phenomenon of over-squashing that causes poor performance for tasks relying on long-range interactions. This can be largely attributed to message passing only occurring locally, over a node's immediate neighbours. Rewiring approaches attempting to make graphs 'more connected', and supposedly better suited to long-range tasks, often lose the inductive bias provided by distance on the graph since they make distant nodes communicate instantly at every layer. In this paper we propose a framework, applicable to any MPNN architecture, that performs a layer-dependent rewiring to ensure gradual densification of the graph. We also propose a delay mechanism that permits skip connections between nodes depending on the layer and their mutual distance. We validate our approach on several long-range tasks and show that it outperforms graph Transformers and multi-hop MPNNs.
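
As a rough illustration of layer-dependent rewiring with delay, the sketch below lets the receptive field grow by one hop per layer and makes k-hop neighbours contribute a state from k-1 layers earlier. The function name, the per-hop weights W_hops, and the exact delay schedule are assumptions made for the sketch; the released DRew implementation may differ.

```python
import torch

def drew_style_layer(h_history, dist, layer, W_hops):
    """One hypothetical layer of layer-dependent multi-hop message passing with delay.

    h_history : list of [N, d] tensors, node states from layers 0..layer (same width d)
    dist      : [N, N] integer matrix of shortest-path distances
    layer     : current layer index (0-based)
    W_hops    : list of per-hop torch.nn.Linear(d, d) maps, length >= layer + 1
    """
    out = torch.zeros_like(h_history[-1])
    max_hop = layer + 1                                 # receptive field grows by one hop per layer
    for k in range(1, max_hop + 1):
        mask = (dist == k).float()                      # neighbours exactly k hops away
        # delay: messages from k-hop neighbours use the state from k-1 layers ago
        h_src = h_history[max(len(h_history) - k, 0)]
        deg = mask.sum(dim=1, keepdim=True).clamp(min=1)
        out = out + W_hops[k - 1](mask @ h_src / deg)   # mean-aggregate per hop
    return torch.relu(out)
```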

* Accepted at ICML 2023; 16 pages 

Edge Directionality Improves Learning on Heterophilic Graphs

May 17, 2023
Emanuele Rossi, Bertrand Charpentier, Francesco Di Giovanni, Fabrizio Frasca, Stephan Günnemann, Michael Bronstein


Graph Neural Networks (GNNs) have become the de facto standard tool for modeling relational data. However, while many real-world graphs are directed, the majority of today's GNN models discard this information altogether by simply making the graph undirected. The reasons for this are historical: 1) many early variants of spectral GNNs explicitly required undirected graphs, and 2) the first benchmarks on homophilic graphs did not show a significant gain from using direction. In this paper, we show that in heterophilic settings, treating the graph as directed increases the effective homophily of the graph, suggesting a potential gain from the correct use of directionality information. To this end, we introduce the Directed Graph Neural Network (Dir-GNN), a novel general framework for deep learning on directed graphs. Dir-GNN can be used to extend any Message Passing Neural Network (MPNN) to account for edge directionality by performing separate aggregations over incoming and outgoing edges. We prove that Dir-GNN matches the expressivity of the Directed Weisfeiler-Lehman test, exceeding that of conventional MPNNs. In extensive experiments, we validate that while our framework leaves performance unchanged on homophilic datasets, it leads to large gains over base models such as GCN, GAT and GraphSAGE on heterophilic benchmarks, outperforming much more complex methods and achieving new state-of-the-art results.
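
The core mechanism, separate aggregation over incoming and outgoing edges, can be sketched in a few lines. This is a minimal illustrative module, not the released Dir-GNN implementation: the class and variable names are made up, and the actual framework wraps arbitrary MPNN aggregators rather than the fixed mean used here.

```python
import torch
import torch.nn as nn

class DirConvSketch(nn.Module):
    """Direction-aware convolution sketch: distinct weights for in- and out-neighbours."""

    def __init__(self, d_in, d_out):
        super().__init__()
        self.w_self = nn.Linear(d_in, d_out)
        self.w_in = nn.Linear(d_in, d_out)    # messages along incoming edges
        self.w_out = nn.Linear(d_in, d_out)   # messages along outgoing edges

    def forward(self, x, adj):
        # adj[i, j] = 1 if there is a directed edge i -> j
        deg_out = adj.sum(dim=1, keepdim=True).clamp(min=1)
        deg_in = adj.sum(dim=0).unsqueeze(1).clamp(min=1)
        agg_out = adj @ x / deg_out           # mean over successors of each node
        agg_in = adj.t() @ x / deg_in         # mean over predecessors of each node
        return torch.relu(self.w_self(x) + self.w_in(agg_in) + self.w_out(agg_out))
```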


On Over-Squashing in Message Passing Neural Networks: The Impact of Width, Depth, and Topology

Feb 06, 2023
Francesco Di Giovanni, Lorenzo Giusti, Federico Barbero, Giulia Luise, Pietro Lio', Michael Bronstein


Message Passing Neural Networks (MPNNs) are instances of Graph Neural Networks that leverage the graph to send messages over the edges. This inductive bias leads to a phenomenon known as over-squashing, where a node feature is insensitive to information contained at distant nodes. Despite recent methods introduced to mitigate this issue, an understanding of the causes of over-squashing and of possible solutions is lacking. In this theoretical work, we prove that: (i) neural network width can mitigate over-squashing, but at the cost of making the whole network more sensitive; (ii) conversely, depth cannot help mitigate over-squashing: increasing the number of layers leads to over-squashing being dominated by vanishing gradients; (iii) the graph topology plays the greatest role, since over-squashing occurs between nodes at high commute (access) time. Our analysis provides a unified framework to study different recent methods introduced to cope with over-squashing and serves as a justification for a class of methods that fall under 'graph rewiring'.
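
Commute time, the quantity the analysis ties over-squashing to, is cheap to compute on small graphs via the standard identity C(u, v) = 2|E| * R_eff(u, v), where the effective resistance comes from the pseudoinverse of the graph Laplacian. The utility below is a small sketch of that identity, not code from the paper.

```python
import torch

def commute_times(adj):
    """Pairwise commute times for a connected, undirected, unweighted graph.

    adj : [N, N] symmetric 0/1 adjacency matrix (float tensor)
    """
    deg = adj.sum(dim=1)
    L = torch.diag(deg) - adj                       # combinatorial Laplacian
    L_pinv = torch.linalg.pinv(L)
    d = torch.diagonal(L_pinv)
    # effective resistance: R(u, v) = L+_uu + L+_vv - 2 L+_uv
    resistance = d[:, None] + d[None, :] - 2 * L_pinv
    return adj.sum() * resistance                   # adj.sum() = 2|E|
```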

* 24 pages 

Graph Neural Networks as Gradient Flows

Jun 22, 2022
Francesco Di Giovanni, James Rowbottom, Benjamin P. Chamberlain, Thomas Markovich, Michael M. Bronstein


Dynamical systems minimizing an energy are ubiquitous in geometry and physics. We propose a gradient flow framework for GNNs where the equations follow the direction of steepest descent of a learnable energy. This approach allows us to explain the GNN evolution from a multi-particle perspective as learning attractive and repulsive forces in feature space via the positive and negative eigenvalues of a symmetric "channel-mixing" matrix. We perform spectral analysis of the solutions and conclude that gradient flow graph convolutional models can induce a dynamics dominated by the graph's high frequencies, which is desirable for heterophilic datasets. We also describe structural constraints on common GNN architectures that allow them to be interpreted as gradient flows. We perform thorough ablation studies corroborating our theoretical analysis and show competitive performance of simple and lightweight models on real-world homophilic and heterophilic datasets.
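
One common way to write such a parametrised Dirichlet-type energy and its gradient flow is sketched below in assumed notation (X the node-feature matrix, Ā a symmetrically normalised adjacency, Ω and W symmetric learnable matrices); the paper's exact parametrisation may differ.

```latex
% A learnable graph energy and its gradient flow (assumed notation):
E(X) = \tfrac{1}{2}\sum_i \langle \mathbf{x}_i, \Omega\,\mathbf{x}_i\rangle
     - \tfrac{1}{2}\sum_{i,j} \bar{a}_{ij}\,\langle \mathbf{x}_i, W\,\mathbf{x}_j\rangle,
\qquad W = W^\top,\ \Omega = \Omega^\top.
% Steepest descent on E and its explicit Euler discretisation (one "layer"):
\dot{X} = -\nabla_X E(X) = -X\Omega + \bar{A}XW,
\qquad X^{t+\tau} = X^{t} + \tau\,(\bar{A}X^{t}W - X^{t}\Omega).
% Positive eigenvalues of W act as attractive (smoothing) directions in feature
% space, negative eigenvalues as repulsive (sharpening) ones.
```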

* 27 pages 

Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs

Feb 09, 2022
Cristian Bodnar, Francesco Di Giovanni, Benjamin Paul Chamberlain, Pietro Liò, Michael M. Bronstein


Cellular sheaves equip graphs with "geometrical" structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that discretise this equation. In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. At the same time, we prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour. On the practical side, we study how sheaves can be learned from data. The resulting sheaf diffusion models have many desirable properties that address the limitations of classical graph diffusion equations (and corresponding GNN models) and obtain state-of-the-art results in heterophilic settings. Overall, our work provides new connections between GNNs and algebraic topology and would be of interest to both fields.
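
For concreteness, the sheaf Laplacian that drives the diffusion can be assembled directly from the restriction maps. The sketch below builds the plain (unnormalised) operator; the neural model additionally learns the restriction maps from data and interleaves weights and nonlinearities. Function names and the data layout are assumptions.

```python
import torch

def sheaf_laplacian(edges, F, n, d):
    """Assemble the Laplacian of a cellular sheaf on a graph.

    edges : list of (u, v) node pairs
    F     : dict mapping (node, edge_index) -> [d, d] restriction map
    n, d  : number of nodes and stalk dimension
    """
    L = torch.zeros(n * d, n * d)
    for e, (u, v) in enumerate(edges):
        Fu, Fv = F[(u, e)], F[(v, e)]
        L[u*d:(u+1)*d, u*d:(u+1)*d] += Fu.t() @ Fu
        L[v*d:(v+1)*d, v*d:(v+1)*d] += Fv.t() @ Fv
        L[u*d:(u+1)*d, v*d:(v+1)*d] -= Fu.t() @ Fv
        L[v*d:(v+1)*d, u*d:(u+1)*d] -= Fv.t() @ Fu
    return L

# One explicit-Euler step of sheaf diffusion on stacked node features x in R^{n*d}:
#   x_next = x - tau * sheaf_laplacian(edges, F, n, d) @ x
# With trivial (identity) restriction maps this reduces to ordinary graph diffusion.
```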

* 23 pages, 8 figures 

Heterogeneous manifolds for curvature-aware graph embedding

Feb 02, 2022
Francesco Di Giovanni, Giulia Luise, Michael Bronstein


Graph embeddings, wherein the nodes of the graph are represented by points in a continuous space, are used in a broad range of Graph ML applications. The quality of such embeddings crucially depends on whether the geometry of the space matches that of the graph. Euclidean spaces are often a poor choice for many types of real-world graphs, where hierarchical structure and a power-law degree distribution are linked to negative curvature. In this regard, it has recently been shown that hyperbolic spaces and more general manifolds, such as products of constant-curvature spaces and matrix manifolds, are advantageous for approximately matching pairwise node distances. However, all these classes of manifolds are homogeneous, implying that the curvature distribution is the same at each point, making them unsuited to match the local curvature (and related structural properties) of the graph. In this paper, we study graph embeddings in a broader class of heterogeneous rotationally-symmetric manifolds. By adding a single extra radial dimension to any given existing homogeneous model, we can account for both heterogeneous curvature distributions on graphs and pairwise distances. We evaluate our approach on reconstruction tasks on synthetic and real datasets and show its potential for better preserving higher-order structures and for generating heterogeneous random graphs.
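
A rotationally symmetric model of the kind described can be written as a warped product: one radial coordinate plus a rescaled copy of an existing homogeneous space. The block below states this in standard differential-geometric notation as an illustration; the paper's specific model class and warping functions may be parametrised differently.

```latex
% A rotationally symmetric (warped-product) metric over a homogeneous fibre (H, g_H):
g = \mathrm{d}r^2 + \psi(r)^2\, g_H, \qquad r > 0.
% In the two-dimensional case (a surface of revolution) the Gaussian curvature is
K(r) = -\frac{\psi''(r)}{\psi(r)},
% so the choice of warping function \psi controls how curvature varies with the
% radial coordinate, unlike in homogeneous spaces where it is the same everywhere.
```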


Understanding over-squashing and bottlenecks on graphs via curvature

Nov 29, 2021
Jake Topping, Francesco Di Giovanni, Benjamin Paul Chamberlain, Xiaowen Dong, Michael M. Bronstein


Most graph neural networks (GNNs) use the message passing paradigm, in which node features are propagated on the input graph. Recent works have pointed to the distortion of information flowing from distant nodes as a factor limiting the efficiency of message passing for tasks relying on long-distance interactions. This phenomenon, referred to as 'over-squashing', has been heuristically attributed to graph bottlenecks where the number of $k$-hop neighbors grows rapidly with $k$. We provide a precise description of the over-squashing phenomenon in GNNs and analyze how it arises from bottlenecks in the graph. For this purpose, we introduce a new edge-based combinatorial curvature and prove that negatively curved edges are responsible for the over-squashing issue. We also propose and experimentally test a curvature-based graph rewiring method to alleviate the over-squashing.
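
As a point of reference, the classical Forman curvature of an edge (u, v) in an unweighted graph with no higher-dimensional cells is simply 4 - deg(u) - deg(v); the sketch below computes it for all edges. This is only a crude classical proxy, not the Balanced Forman curvature introduced in the paper, which additionally accounts for triangles and 4-cycles through the edge; the paper's rewiring method then targets the most negatively curved regions of the graph.

```python
import torch

def forman_curvature(adj):
    """Classical Forman curvature of every edge in an unweighted graph.

    adj : [N, N] symmetric 0/1 adjacency matrix (float tensor).
    Returns an [N, N] tensor with F(u, v) = 4 - deg(u) - deg(v) on edges, NaN elsewhere.
    """
    deg = adj.sum(dim=1)
    curv = 4 - deg[:, None] - deg[None, :]
    return torch.where(adj > 0, curv, torch.full_like(curv, float('nan')))
```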


Beltrami Flow and Neural Diffusion on Graphs

Oct 18, 2021
Benjamin Paul Chamberlain, James Rowbottom, Davide Eynard, Francesco Di Giovanni, Xiaowen Dong, Michael M Bronstein


We propose a novel class of graph neural networks based on the discretised Beltrami flow, a non-Euclidean diffusion PDE. In our model, node features are supplemented with positional encodings derived from the graph topology and jointly evolved by the Beltrami flow, simultaneously producing continuous feature learning and topology evolution. The resulting model generalises many popular graph neural networks and achieves state-of-the-art results on several benchmarks.
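
A minimal caricature of the idea, one explicit-Euler step of attention-driven diffusion on the joint state [features, positional encodings], is sketched below. The projection matrices w_q and w_k are hypothetical, and the dense all-pairs attention stands in for the learned, graph-restricted attention of the actual model.

```python
import torch

def beltrami_style_step(feat, pos, tau, w_q, w_k):
    """One explicit-Euler diffusion step on joint node states z_i = [x_i, y_i].

    feat : [N, d_f] node features      pos : [N, d_p] positional encodings
    tau  : step size                   w_q, w_k : [d_f + d_p, d_k] projections
    """
    z = torch.cat([feat, pos], dim=-1)                    # joint feature/position state
    scores = (z @ w_q) @ (z @ w_k).t()                    # pairwise attention logits
    attn = torch.softmax(scores / w_k.shape[1] ** 0.5, dim=-1)
    z_next = z + tau * (attn @ z - z)                     # dz/dt = (A(z) - I) z
    return z_next[:, :feat.shape[-1]], z_next[:, feat.shape[-1]:]
```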

* 21 pages, 5 figures. Proceedings of the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS) 2021 