Davide Bacciu

Dipartimento di Informatica, Università di Pisa

Self-generated Replay Memories for Continual Neural Machine Translation

Mar 19, 2024

Michele Resta, Davide Bacciu

Abstract: Modern Neural Machine Translation systems exhibit strong performance in several different languages and are constantly improving. Their ability to learn continuously is, however, still severely limited by the catastrophic forgetting issue. In this work, we leverage a key property of encoder-decoder Transformers, i.e. their generative ability, to propose a novel approach to continually learning Neural Machine Translation systems. We show how this can effectively learn on a stream of experiences comprising different languages, by leveraging a replay memory populated by using the model itself as a generator of parallel sentences. We empirically demonstrate that our approach can counteract catastrophic forgetting without requiring explicit memorization of training data. Code will be publicly available upon publication. Code: https://github.com/m-resta/sg-rep

* Accepted at NAACL 2024

Multi-Relational Graph Neural Network for Out-of-Domain Link Prediction

Mar 17, 2024

Asma Sattar, Georgios Deligiorgis, Marco Trincavelli, Davide Bacciu

Abstract: Dynamic multi-relational graphs are an expressive relational representation for data enclosing entities and relations of different types, and where relationships are allowed to vary in time. Addressing predictive tasks over such data requires the ability to find structure embeddings that capture the diversity of the relationships involved, as well as their dynamic evolution. In this work, we establish a novel class of challenging tasks for dynamic multi-relational graphs involving out-of-domain link prediction, where the relationship being predicted is not available in the input graph. We then introduce a novel Graph Neural Network model, named GOOD, designed specifically to tackle the out-of-domain generalization problem. GOOD introduces a novel design concept for multi-relation embedding aggregation, based on the idea that a representation is good when it is possible to disentangle the mixing proportions of the different relational embeddings that produced it. We also propose five benchmarks based on two retail domains, where we show that GOOD can effectively generalize predictions out of known relationship types and achieve state-of-the-art results. Most importantly, we provide insights into problems where out-of-domain prediction might be preferred to an in-domain formulation, that is, where the relationship to be predicted has very few positive examples.

* 8 pages, 3 figures, 3 tables; accepted at IEEE WCCI 2024

Adaptive Hyperparameter Optimization for Continual Learning Scenarios

Mar 09, 2024

Rudy Semola, Julio Hurtado, Vincenzo Lomonaco, Davide Bacciu

Abstract: Hyperparameter selection in continual learning scenarios is a challenging and underexplored aspect, especially in practical non-stationary environments. Traditional approaches, such as grid searches with held-out validation data from all tasks, are unrealistic for building accurate lifelong learning systems. This paper aims to explore the role of hyperparameter selection in continual learning and the necessity of continually and automatically tuning hyperparameters according to the complexity of the task at hand. Hence, we propose leveraging the nature of sequence task learning to improve Hyperparameter Optimization efficiency. Using techniques based on functional analysis of variance, we identify the most crucial hyperparameters that have an impact on performance. We demonstrate empirically that this approach, agnostic to continual scenarios and strategies, allows us to speed up hyperparameter optimization continually across tasks and exhibits robustness even in the face of varying sequential task orders. We believe that our findings can contribute to the advancement of continual learning methodologies towards more efficient, robust and adaptable models for real-world applications.

Awareness in robotics: An early perspective from the viewpoint of the EIC Pathfinder Challenge "Awareness Inside"

Feb 14, 2024

Cosimo Della Santina, Carlos Hernandez Corbato, Burak Sisman, Luis A. Leiva, Ioannis Arapakis, Michalis Vakalellis, Jean Vanderdonckt, Luis Fernando D'Haro, Guido Manzi, Cristina Becchio (+21 more)

Abstract: Consciousness has been historically a heavily debated topic in engineering, science, and philosophy. On the contrary, awareness had less success in raising the interest of scholars in the past. However, things are changing as more and more researchers are getting interested in answering questions concerning what awareness is and how it can be artificially generated. The landscape is rapidly evolving, with multiple voices and interpretations of the concept being conceived and techniques being developed. The goal of this paper is to summarize and discuss the ones among these voices connected with projects funded by the EIC Pathfinder Challenge called "Awareness Inside", a nonrecurring call for proposals within Horizon Europe designed specifically for fostering research on natural and synthetic awareness. In this perspective, we dedicate special attention to challenges and promises of applying synthetic awareness in robotics, as the development of mature techniques in this new field is expected to have a special impact on generating more capable and trustworthy embodied systems.

Classifier-free graph diffusion for molecular property targeting

Dec 28, 2023

Matteo Ninniri, Marco Podda, Davide Bacciu

Abstract: This work focuses on the task of property targeting: that is, generating molecules conditioned on target chemical properties to expedite candidate screening for novel drug and materials development. DiGress is a recent diffusion model for molecular graphs whose distinctive feature is allowing property targeting through classifier-based (CB) guidance. While CB guidance may work to generate molecular-like graphs, we hint at the fact that its assumptions apply poorly to the chemical domain. Based on this insight we propose a classifier-free DiGress (FreeGress), which works by directly injecting the conditioning information into the training process. Classifier-free (CF) guidance is convenient given its less stringent assumptions and since it does not require training an auxiliary property regressor, thus halving the number of trainable parameters in the model. We empirically show that our model yields up to 79% improvement in Mean Absolute Error with respect to DiGress on property targeting tasks on QM9 and ZINC-250k benchmarks. As an additional contribution, we propose a simple yet powerful approach to improve chemical validity of generated samples, based on the observation that certain chemical properties such as molecular weight correlate with the number of atoms in molecules.

* Accepted to GCLR workshop (AAAI '24)

Neural Autoencoder-Based Structure-Preserving Model Order Reduction and Control Design for High-Dimensional Physical Systems

Dec 11, 2023

Marco Lepri, Davide Bacciu, Cosimo Della Santina

Abstract: This work concerns control-oriented and structure-preserving learning of low-dimensional approximations of high-dimensional physical systems, with a focus on mechanical systems. We investigate the integration of neural autoencoders in model order reduction, while at the same time preserving Hamiltonian or Lagrangian structures. We focus on extensively evaluating the considered methodology by performing simulation and control experiments on large mass-spring-damper networks, with hundreds of states. The empirical findings reveal that compressed latent dynamics with fewer than 5 degrees of freedom can accurately reconstruct the original systems' transient and steady-state behavior with a relative total error of around 4%, while simultaneously accurately reconstructing the total energy. Leveraging this system compression technique, we introduce a model-based controller that exploits the mathematical structure of the compressed model to regulate the configuration of heavily underactuated mechanical systems.

* 11 pages, 14 figures

Constraint-Free Structure Learning with Smooth Acyclic Orientations

Sep 15, 2023

Riccardo Massidda, Francesco Landolfi, Martina Cinquini, Davide Bacciu

Abstract: The structure learning problem consists of fitting data generated by a Directed Acyclic Graph (DAG) to correctly reconstruct its arcs. In this context, differentiable approaches constrain or regularize the optimization problem using a continuous relaxation of the acyclicity property. The computational cost of evaluating graph acyclicity is cubic in the number of nodes and significantly affects scalability. In this paper we introduce COSMO, a constraint-free continuous optimization scheme for acyclic structure learning. At the core of our method, we define a differentiable approximation of an orientation matrix parameterized by a single priority vector. Unlike previous work, our parameterization fits a smooth orientation matrix and the resulting acyclic adjacency matrix without evaluating acyclicity at any step. Despite the absence of explicit constraints, we prove that COSMO always converges to an acyclic solution. In addition to being asymptotically faster, our empirical analysis highlights how COSMO performance on graph reconstruction compares favorably with competing structure learning methods.
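
The priority-vector idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the smooth orientation takes a sigmoid form over priority differences, and the function names and the `temperature` parameter are hypothetical, chosen only for illustration. The point it shows is that any orientation induced by a strict ordering of node priorities is acyclic by construction, so no acyclicity check is needed during optimization.

```python
import math


def smooth_orientation(priority, temperature=1.0):
    """Assumed smooth orientation: S[u][v] = sigmoid((p[v] - p[u]) / temperature).

    Values near 1 mean the arc u -> v is favored; the diagonal is left at 0.
    """
    n = len(priority)
    S = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u != v:
                diff = (priority[v] - priority[u]) / temperature
                S[u][v] = 1.0 / (1.0 + math.exp(-diff))
    return S


def hard_orientation(priority):
    """Hard version of the orientation: arc u -> v is allowed only if p[u] < p[v]."""
    n = len(priority)
    return [[1 if priority[u] < priority[v] else 0 for v in range(n)]
            for u in range(n)]


def is_acyclic(adj):
    """Kahn's algorithm: a graph is acyclic iff every node can be peeled off."""
    n = len(adj)
    indegree = [sum(1 for u in range(n) if adj[u][v]) for v in range(n)]
    stack = [v for v in range(n) if indegree[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in range(n):
            if adj[u][v]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    stack.append(v)
    return removed == n


# Hypothetical priorities for three nodes.
priority = [0.3, -1.2, 2.0]
S = smooth_orientation(priority)
H = hard_orientation(priority)
print(is_acyclic(H))
```

In this sketch the acyclicity test is used only to verify the result after the fact; the orientation itself is built from pairwise priority comparisons, which is the property the abstract relies on to avoid the cubic-cost acyclicity evaluation.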

Modeling Edge Features with Deep Bayesian Graph Networks

Aug 17, 2023

Daniele Atzeni, Federico Errica, Davide Bacciu, Alessio Micheli

Abstract: We propose an extension of the Contextual Graph Markov Model, a deep and probabilistic machine learning model for graphs, to model the distribution of edge features. Our approach is architectural, as we introduce an additional Bayesian network mapping edge features into discrete states to be used by the original model. In doing so, we are also able to build richer graph representations even in the absence of edge features, which is confirmed by the performance improvements on standard graph classification benchmarks. Moreover, we successfully test our proposal in a graph regression scenario where edge features are of fundamental importance, and we show that the learned edge representation provides substantial performance improvements against the original model on three link prediction tasks. By keeping the computational complexity linear in the number of edges, the proposed model is amenable to large-scale graph processing.

* Releasing pre-print version to comply with TAILOR project requirements

Graph-based Polyphonic Multitrack Music Generation

Jul 27, 2023

Emanuele Cosenza, Andrea Valenti, Davide Bacciu

Abstract: Graphs can be leveraged to model polyphonic multitrack symbolic music, where notes, chords and entire sections may be linked at different levels of the musical hierarchy by tonal and rhythmic relationships. Nonetheless, few works consider graph representations in the context of deep learning systems for music generation. This paper bridges this gap by introducing a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately, one after the other, with a hierarchical architecture that matches the structural priors of music. By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times. This opens the door to a new form of human-computer interaction in the context of music co-creation. After training the model on existing MIDI datasets, the experiments show that the model is able to generate appealing short and long musical sequences and to realistically interpolate between them, producing music that is tonally and rhythmically consistent. Finally, the visualization of the embeddings shows that the model is able to organize its latent space in accordance with known musical concepts.

Deep learning for dynamic graphs: models and benchmarks

Jul 12, 2023

Alessio Gravina, Davide Bacciu

Abstract: Recent progress in research on Deep Graph Networks (DGNs) has led to a maturation of the domain of learning on graphs. Despite the growth of this research field, there are still important challenges that are yet unsolved. Specifically, there is an urgent need to make DGNs suitable for predictive tasks on real-world systems of interconnected entities, which evolve over time. With the aim of fostering research in the domain of dynamic graphs, we first survey recent advances in learning both temporal and spatial information, providing a comprehensive overview of the current state of the art in the domain of representation learning for dynamic graphs. Secondly, we conduct a fair performance comparison among the most popular proposed approaches, leveraging rigorous model selection and assessment for all the methods, thus establishing a sound baseline for evaluating new architectures and approaches.