Davide Bacciu

Dipartimento di Informatica, Università di Pisa

Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts

Mar 20, 2025

Andrea Pugnana, Riccardo Massidda, Francesco Giannini, Pietro Barbiero, Mateo Espinosa Zarlenga, Roberto Pellungrini, Gabriele Dominici, Fosca Giannotti, Davide Bacciu

Abstract: Concept Bottleneck Models (CBMs) are machine learning models that improve interpretability by grounding their predictions on human-understandable concepts, allowing for targeted interventions in their decision-making process. However, when intervened on, CBMs assume the availability of humans who can identify the need to intervene and always provide correct interventions. Both assumptions are unrealistic and impractical, considering labor costs and human error-proneness. In contrast, Learning to Defer (L2D) extends supervised learning by allowing machine learning models to identify cases where a human is more likely to be correct than the model, thus leading to deferring systems with improved performance. In this work, we draw inspiration from L2D and propose Deferring CBMs (DCBMs), a novel framework that allows CBMs to learn when an intervention is needed. To this end, we model DCBMs as a composition of deferring systems and derive a consistent L2D loss to train them. Moreover, by relying on a CBM architecture, DCBMs can explain why deferral occurs on the final task. Our results show that DCBMs achieve high predictive performance and interpretability at the cost of deferring more to humans.
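
The deferral idea summarized in the abstract (hand a case to a human when the human is more likely to be correct than the model) can be illustrated with a minimal sketch. This is not the DCBM training loss or architecture; it is only a hand-written decision rule, and `expert_correct_prob` is a hypothetical estimate of the expert's accuracy on the given input that a real deferring system would have to learn.

```python
def should_defer(model_probs, expert_correct_prob):
    """Return True when the (assumed) expert accuracy beats the model's top confidence."""
    return expert_correct_prob > max(model_probs)


def predict_or_defer(model_probs, expert_correct_prob, expert_label):
    """Return (label, deferred) using the illustrative rule above."""
    if should_defer(model_probs, expert_correct_prob):
        # The human is judged more reliable here, so their label is used.
        return expert_label, True
    # Otherwise keep the model's most probable class.
    best = max(range(len(model_probs)), key=model_probs.__getitem__)
    return best, False
```

In this sketch an uncertain model (`[0.6, 0.4]`) paired with a reliable expert defers, while a confident model (`[0.95, 0.05]`) keeps its own prediction.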

Towards Efficient Molecular Property Optimization with Graph Energy Based Models

Feb 17, 2025

Luca Miglior, Lorenzo Simone, Marco Podda, Davide Bacciu

Abstract: Optimizing chemical properties is a challenging task due to the vastness and complexity of chemical space. Here, we present a generative energy-based architecture for implicit chemical property optimization, designed to efficiently generate molecules that satisfy target properties without explicit conditional generation. We use Graph Energy Based Models and a training approach that does not require property labels. We validated our approach on well-established chemical benchmarks, showing superior results to state-of-the-art methods and demonstrating robustness and efficiency towards de novo drug design.

* Accepted at ESANN 2025

I Know How: Combining Prior Policies to Solve New Tasks

Jun 14, 2024

Malio Li, Elia Piccoli, Vincenzo Lomonaco, Davide Bacciu

Abstract: Multi-Task Reinforcement Learning aims at developing agents that are able to continually evolve and adapt to new scenarios. However, this goal is challenging to achieve due to the phenomenon of catastrophic forgetting and the high demand for computational resources. Learning from scratch for each new task is not a viable or sustainable option, and thus agents should be able to collect and exploit prior knowledge while facing new problems. While several methodologies have attempted to address the problem from different perspectives, they lack a common structure. In this work, we propose a new framework, I Know How (IKH), which provides a common formalization. Our methodology focuses on modularity and compositionality of knowledge in order to achieve and enhance an agent's ability to learn and adapt efficiently to dynamic environments. To support our framework definition, we present a simple application of it in a simulated driving environment and compare its performance with that of state-of-the-art approaches.

* 7 pages, Conference on Games (CoG) 2024

Long Range Propagation on Continuous-Time Dynamic Graphs

Jun 04, 2024

Alessio Gravina, Giulio Lovisotto, Claudio Gallicchio, Davide Bacciu, Claas Grohnfeldt

Abstract: Learning Continuous-Time Dynamic Graphs (C-TDGs) requires accurately modeling spatio-temporal information on streams of irregularly sampled events. While many methods have been proposed recently, we find that most message passing-, recurrent- or self-attention-based methods perform poorly on long-range tasks. These tasks require correlating information that occurred "far" away from the current event, either spatially (higher-order node information) or along the time dimension (events that occurred in the past). To address long-range dependencies, we introduce Continuous-Time Graph Anti-Symmetric Network (CTAN). Grounded within the ordinary differential equations framework, our method is designed for efficient propagation of information. In this paper, we show how CTAN's (i) long-range modeling capabilities are substantiated by theoretical findings and how (ii) its empirical performance on synthetic long-range benchmarks and real-world benchmarks is superior to other methods. Our results motivate CTAN's ability to propagate long-range information in C-TDGs as well as the inclusion of long-range tasks as part of temporal graph model evaluation.

* Accepted at ICML 2024 (https://openreview.net/forum?id=gVg8V9isul)

Learning Causal Abstractions of Linear Structural Causal Models

Jun 01, 2024

Riccardo Massidda, Sara Magliacane, Davide Bacciu

Abstract: The need for modelling causal knowledge at different levels of granularity arises in several settings. Causal Abstraction provides a framework for formalizing this problem by relating two Structural Causal Models at different levels of detail. Despite increasing interest in applying causal abstraction, e.g. in the interpretability of large machine learning models, the graphical and parametrical conditions under which a causal model can abstract another are not known. Furthermore, learning causal abstractions from data is still an open problem. In this work, we tackle both issues for linear causal models with linear abstraction functions. First, we characterize how the low-level coefficients and the abstraction function determine the high-level coefficients and how the high-level model constrains the causal ordering of low-level variables. Then, we apply our theoretical results to learn high-level and low-level causal models and their abstraction function from observational data. In particular, we introduce Abs-LiNGAM, a method that leverages the constraints induced by the learned high-level model and the abstraction function to speed up the recovery of the larger low-level model, under the assumption of non-Gaussian noise terms. In simulated settings, we show the effectiveness of learning causal abstractions from data and the potential of our method in improving the scalability of causal discovery.

Injecting Hamiltonian Architectural Bias into Deep Graph Networks for Long-Range Propagation

May 27, 2024

Simon Heilig, Alessio Gravina, Alessandro Trenta, Claudio Gallicchio, Davide Bacciu

Abstract: The dynamics of information diffusion within graphs is a critical open issue that heavily influences graph representation learning, especially when considering long-range propagation. This calls for principled approaches that control and regulate the degree of propagation and dissipation of information throughout the neural flow. Motivated by this, we introduce (port-)Hamiltonian Deep Graph Networks, a novel framework that models neural information flow in graphs by building on the laws of conservation of Hamiltonian dynamical systems. We reconcile under a single theoretical and practical framework both non-dissipative long-range propagation and non-conservative behaviors, introducing tools from mechanical systems to gauge the equilibrium between the two components. Our approach can be applied to general message-passing architectures, and it provides theoretical guarantees on information conservation in time. Empirical results prove the effectiveness of our port-Hamiltonian scheme in pushing simple graph convolutional architectures to state-of-the-art performance in long-range benchmarks.

Tackling Graph Oversquashing by Global and Local Non-Dissipativity

May 02, 2024

Alessio Gravina, Moshe Eliasof, Claudio Gallicchio, Davide Bacciu, Carola-Bibiane Schönlieb

Abstract: A common problem in Message-Passing Neural Networks is oversquashing -- the limited ability to facilitate effective information flow between distant nodes. Oversquashing is attributed to the exponential decay in information transmission as node distances increase. This paper introduces a novel perspective to address oversquashing, leveraging properties of global and local non-dissipativity, which enable the maintenance of a constant information flow rate. Namely, we present SWAN, a uniquely parameterized GNN model with antisymmetry both in space and weight domains, as a means to obtain non-dissipativity. Our theoretical analysis asserts that by achieving these properties, SWAN offers an enhanced ability to transmit information over extended distances. Empirical evaluations on synthetic and real-world benchmarks that emphasize long-range interactions validate the theoretical understanding of SWAN, and its ability to mitigate oversquashing.
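
The link between antisymmetry and non-dissipativity mentioned in the abstract can be illustrated on a toy linear system. The sketch below is not SWAN's parameterization; it only checks a standard fact: for an antisymmetric matrix A = W - Wᵀ, the quadratic form xᵀAx is zero, so the linear dynamics dx/dt = Ax neither grow nor shrink the squared norm of x. The matrix size, the random seed, and all function names are assumptions made for this example.

```python
import random


def antisymmetric(w):
    """Build A = W - W^T, which satisfies A^T = -A."""
    n = len(w)
    return [[w[i][j] - w[j][i] for j in range(n)] for i in range(n)]


def quadratic_form(a, x):
    """Compute x^T A x for a square matrix a and a vector x."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


rng = random.Random(0)  # fixed seed so the example is repeatable
n = 4
w = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]

a = antisymmetric(w)
# d/dt ||x||^2 = 2 * x^T A x under dx/dt = A x; zero up to rounding error.
norm_rate = 2.0 * quadratic_form(a, x)
# The same quantity for the unconstrained W is generally non-zero.
unconstrained_rate = 2.0 * quadratic_form(w, x)
```

With an unconstrained `w`, the rate is generally non-zero, meaning the state norm can decay or blow up; the antisymmetric construction removes that degree of freedom.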

Temporal Graph ODEs for Irregularly-Sampled Time Series

Apr 30, 2024

Alessio Gravina, Daniele Zambon, Davide Bacciu, Cesare Alippi

Abstract: Modern graph representation learning works mostly under the assumption of dealing with regularly sampled temporal graph snapshots, which is far from realistic: for example, social networks and physical systems are characterized by continuous dynamics and sporadic observations. To address this limitation, we introduce the Temporal Graph Ordinary Differential Equation (TG-ODE) framework, which learns both the temporal and spatial dynamics from graph streams where the intervals between observations are not regularly spaced. We empirically validate the proposed approach on several graph benchmarks, showing that TG-ODE can achieve state-of-the-art performance in irregular graph stream tasks.

* Preprint. Accepted at IJCAI 2024

MultiSTOP: Solving Functional Equations with Reinforcement Learning

Apr 23, 2024

Alessandro Trenta, Davide Bacciu, Andrea Cossu, Pietro Ferrero

Abstract: We develop MultiSTOP, a Reinforcement Learning framework for solving functional equations in physics. This new methodology produces actual numerical solutions instead of bounds on them. We extend the original BootSTOP algorithm by adding multiple constraints derived from domain-specific knowledge, even in integral form, to improve the accuracy of the solution. We investigate a particular equation in a one-dimensional Conformal Field Theory.

* ICLR 2024 Workshop on AI4DifferentialEquations In Science

Calibration of Continual Learning Models

Apr 12, 2024

Lanpei Li, Elia Piccoli, Andrea Cossu, Davide Bacciu, Vincenzo Lomonaco

Abstract: Continual Learning (CL) focuses on maximizing the predictive performance of a model across a non-stationary stream of data. Unfortunately, CL models tend to forget previous knowledge, thus often underperforming when compared with an offline model trained jointly on the entire data stream. Given that any CL model will eventually make mistakes, it is of crucial importance to build calibrated CL models: models that can reliably tell their confidence when making a prediction. Model calibration is an active research topic in machine learning, yet to be properly investigated in CL. We provide the first empirical study of the behavior of calibration approaches in CL, showing that CL strategies do not inherently learn calibrated models. To mitigate this issue, we design a continual calibration approach that improves the performance of post-processing calibration methods over a wide range of different benchmarks and CL strategies. CL does not necessarily need perfect predictive models, but rather it can benefit from reliable predictive models. We believe our study on continual calibration represents a first step towards this direction.
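
To make "calibrated" concrete, the sketch below computes the Expected Calibration Error (ECE), a widely used calibration metric. The abstract does not state which metrics the paper reports, so treating ECE as the measure of interest is an assumption, as are the function name, the equal-width binning, and the default of 10 bins.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted top-class probabilities in [0, 1].
    correct: booleans, True where the prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A model that reports 0.9 confidence on four predictions but gets only three right has a gap of about 0.15 in that bin; a perfectly calibrated model would score 0.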

* Accepted at CLVISION workshop, CVPR 2024
