Pietro Michiardi

Multi-modal Latent Diffusion

Jun 07, 2023
Mustapha Bounoua, Giulio Franzese, Pietro Michiardi

Multi-modal datasets are ubiquitous in modern applications, and multi-modal Variational Autoencoders are a popular family of models that aim to learn a joint representation of the different modalities. However, existing approaches suffer from a coherence-quality tradeoff, where models with good generation quality lack generative coherence across modalities, and vice versa. We discuss the limitations underlying the unsatisfactory performance of existing methods, to motivate the need for a different approach. We propose a novel method that uses a set of independently trained, uni-modal, deterministic autoencoders. Individual latent variables are concatenated into a common latent space, which is fed to a masked diffusion model to enable generative modeling. We also introduce a new multi-time training method to learn the conditional score network for multi-modal diffusion. Our methodology substantially outperforms competitors in both generation quality and coherence, as shown through an extensive experimental campaign.
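
A minimal sketch of the latent construction described above, assuming two pre-trained uni-modal autoencoders and a variance-preserving noising schedule; the encoder names, latent dimensions, and schedule are illustrative placeholders, not the paper's exact implementation.

import torch

def joint_latent(enc_img, enc_txt, x_img, x_txt):
    """Concatenate independently trained uni-modal latents into one common latent."""
    with torch.no_grad():                       # the deterministic autoencoders stay frozen
        z_img = enc_img(x_img)                  # (B, d_img)
        z_txt = enc_txt(x_txt)                  # (B, d_txt)
    return torch.cat([z_img, z_txt], dim=-1)    # (B, d_img + d_txt)

def masked_noising(z, d_img, t_img, t_txt):
    """Diffuse each modality's latent block with its own time (multi-time training):
    an observed modality can be kept clean (t = 0) while the others are noised,
    which is what enables conditional, cross-modal generation."""
    z_img, z_txt = z[:, :d_img], z[:, d_img:]
    eps_img, eps_txt = torch.randn_like(z_img), torch.randn_like(z_txt)
    a_img = torch.cos(0.5 * torch.pi * t_img).view(-1, 1)   # assumed cosine schedule
    a_txt = torch.cos(0.5 * torch.pi * t_txt).view(-1, 1)
    z_img_t = a_img * z_img + (1 - a_img**2).sqrt() * eps_img
    z_txt_t = a_txt * z_txt + (1 - a_txt**2).sqrt() * eps_txt
    return torch.cat([z_img_t, z_txt_t], dim=-1), torch.cat([eps_img, eps_txt], dim=-1)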

One-Line-of-Code Data Mollification Improves Optimization of Likelihood-based Generative Models

May 30, 2023
Ba-Hien Tran, Giulio Franzese, Pietro Michiardi, Maurizio Filippone

Generative Models (GMs) have attracted considerable attention due to their tremendous success in various domains, such as computer vision, where they are capable of generating impressively realistic-looking images. Likelihood-based GMs are attractive because they can generate new data with a single model evaluation. However, they typically achieve lower sample quality than state-of-the-art score-based diffusion models (DMs). This paper takes a significant step toward addressing this limitation. The idea is to borrow one of the strengths of score-based DMs, namely the ability to perform accurate density estimation in low-density regions and to address manifold overfitting, by means of data mollification. We connect data mollification through the addition of Gaussian noise to Gaussian homotopy, a well-known technique to improve optimization. Data mollification can be implemented by adding one line of code in the optimization loop, and we demonstrate that this boosts the generation quality of likelihood-based GMs without computational overhead. We report results on image datasets with popular likelihood-based GMs, including variants of variational autoencoders and normalizing flows, showing large improvements in FID score.
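
The "one line of code" mentioned above amounts to perturbing each training batch with Gaussian noise whose scale is annealed to zero over training (Gaussian homotopy). A hedged sketch follows; the linear annealing schedule and sigma_max value are assumptions for illustration, not necessarily the paper's exact choices.

import torch

def mollify(x, step, total_steps, sigma_max=1.0):
    """Data mollification: add annealed Gaussian noise to a data batch.
    The noise level decays from sigma_max to 0 as training progresses, so the
    model first fits a smoothed data distribution and then the true one."""
    sigma = sigma_max * max(0.0, 1.0 - step / total_steps)   # assumed linear schedule
    return x + sigma * torch.randn_like(x)

# Inside a standard likelihood-based training loop, the change is a single line,
#   x = mollify(x, step, total_steps)
# applied to the batch before computing the negative log-likelihood loss.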

Continuous-Time Functional Diffusion Processes

Mar 01, 2023
Giulio Franzese, Simone Rossi, Dario Rossi, Markus Heinonen, Maurizio Filippone, Pietro Michiardi

We introduce functional diffusion processes (FDPs), which generalize traditional score-based diffusion models to infinite-dimensional function spaces. FDPs require a new mathematical framework to describe the forward and backward dynamics, and several extensions to derive practical training objectives. These include infinite-dimensional versions of the Girsanov theorem, needed to compute an ELBO, and of the sampling theorem, needed to guarantee that functional evaluations on a countable set of points are equivalent to infinite-dimensional functions. We use FDPs to build a new breed of generative models in function spaces, which do not require specialized network architectures and can work with any kind of continuous data. Our results on synthetic and real data illustrate the advantages of FDPs in simplifying the design requirements of diffusion models.
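
As background (standard material, not taken from the paper itself), the finite-dimensional score-based dynamics that FDPs generalize are the usual forward/reverse SDE pair; the functional setting replaces the state x with an element of a function space, which is why the infinite-dimensional Girsanov and sampling theorems are needed.

  dX_t = f(X_t, t) dt + g(t) dW_t                                          (forward, data -> noise)
  dX_t = [ f(X_t, t) - g(t)^2 \nabla_x \log p_t(X_t) ] dt + g(t) d\bar{W}_t   (reverse, noise -> data)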

"It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning

Jan 07, 2023
Raphael Azorin, Massimo Gallo, Alessandro Finamore, Dario Rossi, Pietro Michiardi

Figure 1 for "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning
Figure 2 for "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning
Figure 3 for "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning
Figure 4 for "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning

While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning. Some tasks may benefit from being learned together, while others may be detrimental to one another. From a task perspective, grouping cooperative tasks while separating competing tasks is paramount to reap the benefits of MTL, i.e., reducing training and inference costs. Estimating task affinity for joint learning is therefore a key endeavor. Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL. Yet, the literature lacks a benchmark to assess the effectiveness of task affinity estimation techniques and their relation with actual MTL performance. In this paper, we take a first step toward filling this gap by (i) defining a set of affinity scores, both by revisiting contributions from the previous literature and by presenting new ones, and (ii) benchmarking them on the Taskonomy dataset. Our empirical campaign reveals that, even in a small-scale scenario, task affinity scoring does not correlate well with actual MTL performance. Yet, some metrics can be more indicative than others.

* 7 pages. AAAI'23 - 2nd International Workshop on Practical Deep Learning in the Wild 
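
For illustration, one simple and widely used affinity proxy in the MTL literature is the cosine similarity between per-task gradients on the shared parameters; whether and how this appears among the paper's scores is not asserted here, and the sketch below is only a generic example of computing such a score.

import torch

def task_gradient(shared_encoder, head, loss_fn, x, y):
    """Flattened gradient of one task's loss w.r.t. the shared encoder parameters."""
    loss = loss_fn(head(shared_encoder(x)), y)
    grads = torch.autograd.grad(loss, list(shared_encoder.parameters()), allow_unused=True)
    return torch.cat([g.reshape(-1) for g in grads if g is not None])

def gradient_affinity(g_a, g_b):
    """Cosine similarity between two task gradients: positive values suggest
    cooperating tasks, negative values suggest competing ones."""
    return torch.nn.functional.cosine_similarity(g_a, g_b, dim=0).item()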
How Much is Enough? A Study on Diffusion Times in Score-based Generative Models

Jun 10, 2022
Giulio Franzese, Simone Rossi, Lixuan Yang, Alessandro Finamore, Dario Rossi, Maurizio Filippone, Pietro Michiardi

Score-based diffusion models are a class of generative models whose dynamics are described by stochastic differential equations that map noise into data. While recent works have started to lay down a theoretical foundation for these models, an analytical understanding of the role of the diffusion time T is still lacking. Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution; however, a smaller value of T should be preferred for a better approximation of the score-matching objective and higher computational efficiency. Starting from a variational interpretation of diffusion models, in this work we quantify this trade-off and suggest a new method to improve the quality and efficiency of both training and sampling by adopting smaller diffusion times. Indeed, we show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process. Empirical results support our analysis; for image data, our method is competitive with the state of the art according to standard sample quality metrics and log-likelihood.
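
A small numerical illustration of this trade-off under a variance-preserving forward SDE with constant beta (an assumed, simplified setting): the forward marginal q(x_T | x_0) = N(x_0 exp(-beta T / 2), (1 - exp(-beta T)) I) only approaches the standard normal prior as T grows, and the residual KL at small T is the gap that the auxiliary model is meant to bridge.

import numpy as np

def kl_forward_marginal_to_prior(x0, beta, T):
    """KL( q(x_T | x_0) || N(0, I) ) for a VP forward SDE with constant beta."""
    d = x0.size
    mean = x0 * np.exp(-0.5 * beta * T)      # mean of q(x_T | x_0)
    var = 1.0 - np.exp(-beta * T)            # per-dimension variance of q(x_T | x_0)
    return 0.5 * (d * var + np.sum(mean**2) - d - d * np.log(var))

x0 = np.ones(64)                             # a toy 64-dimensional data point (assumed)
for T in [0.5, 1.0, 2.0, 5.0, 10.0]:
    print(f"T = {T:4.1f}  KL to prior = {kl_forward_marginal_to_prior(x0, 1.0, T):.4f}")
# The KL decreases towards 0 as T grows: a large T matches the prior well, while a
# small T leaves a mismatch, motivating the auxiliary bridging model described above.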

Safer Autonomous Driving in a Stochastic, Partially-Observable Environment by Hierarchical Contingency Planning

Apr 13, 2022
Ugo Lecerf, Christelle Yemdji-Tchassi, Pietro Michiardi

When learning to act in a stochastic, partially observable environment, an intelligent agent should be prepared to anticipate a change in its belief of the environment state, and be capable of adapting its actions on the fly to changing conditions. As humans, when learning a task we form contingency plans with the explicit aim of correcting errors in the initial control; these plans prove useful whenever a sudden change in our perception of the environment requires immediate corrective action. This is especially the case for autonomous vehicles (AVs) navigating real-world situations where safety is paramount and a strong ability to react to a changing belief about the environment is truly needed. In this paper we explore an end-to-end approach, from training to execution, for learning robust contingency plans and combining them with a hierarchical planner to obtain a robust agent policy in an autonomous navigation task where other vehicles' behaviours are unknown, and the agent's belief about these behaviours is subject to sudden, last-second change. We show that our approach results in robust, safe behaviour in a partially observable, stochastic environment, generalizing well over environment dynamics not seen during training.

* To appear in Generalizable Policy Learning in the Physical World workshop (ICLR 2022) 
Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios

Apr 11, 2022
Ugo Lecerf, Christelle Yemdji-Tchassi, Sébastien Aubert, Pietro Michiardi

When learning to behave in a stochastic environment where safety is critical, such as driving a vehicle in traffic, it is natural for human drivers to plan fallback strategies as a backup to use if ever there is an unexpected change in the environment. Knowing to expect the unexpected, and planning for such outcomes, increases our capability of being robust to unseen scenarios and may help prevent catastrophic failures. In the control of Autonomous Vehicles (AVs), knowing when and how to use fallback strategies is of particular interest for safety. Because of the imperfect information available to an AV about its environment, it is important to have alternate strategies at the ready which might not have been deduced from the original training data distribution. In this paper we present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment. We introduce an extra pseudo-reward term in the reward model to encourage exploration of areas of state-space different from those privileged by the optimal policy. We base this reward term on a distance metric between the trajectories of agents, in order to force policies to focus on different areas of state-space than the initial exploring agent. Throughout the paper, we refer to this particular training paradigm as learning fallback strategies. We apply this method to an autonomous driving scenario and show that we are able to learn useful policies that would otherwise have been missed during training and unavailable when executing the control algorithm.

* To appear in proceedings of International Conference on Machine Learning Technologies (ICMLT) 2022 
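
A hedged sketch of the pseudo-reward idea summarized above: the exploring agent's reward is augmented with a bonus that grows with the distance between its trajectory and a reference trajectory from the already-learned (optimal) policy. The mean Euclidean state distance and the weight alpha are assumptions for illustration; the paper's exact metric and weighting may differ.

import numpy as np

def trajectory_distance(traj_a, traj_b):
    """Mean Euclidean distance between two state trajectories (truncated to equal length)."""
    a, b = np.asarray(traj_a, dtype=float), np.asarray(traj_b, dtype=float)
    n = min(len(a), len(b))
    return float(np.mean(np.linalg.norm(a[:n] - b[:n], axis=-1)))

def augmented_reward(env_reward, agent_traj, reference_traj, alpha=0.1):
    """Environment reward plus a pseudo-reward proportional to the distance from the
    reference (optimal-policy) trajectory, pushing the new policy towards different
    regions of state-space, i.e. towards a fallback strategy."""
    return env_reward + alpha * trajectory_distance(agent_traj, reference_traj)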
Do Deep Neural Networks Contribute to Multivariate Time Series Anomaly Detection?

Apr 04, 2022
Julien Audibert, Pietro Michiardi, Frédéric Guyard, Sébastien Marti, Maria A. Zuluaga

Anomaly detection in time series is a complex task that has been widely studied. In recent years, the capabilities of unsupervised anomaly detection algorithms have received much attention. This trend has led researchers to compare only learning-based methods in their articles, abandoning some more conventional approaches. As a result, the community in this field has been encouraged to propose increasingly complex learning-based models, mainly based on deep neural networks. To our knowledge, there are no comparative studies between conventional, machine learning-based, and deep neural network methods for the detection of anomalies in multivariate time series. In this work, we study the anomaly detection performance of sixteen conventional, machine learning-based, and deep neural network approaches on five real-world open datasets. By analyzing and comparing the performance of each of the sixteen methods, we show that no family of methods outperforms the others. Therefore, we encourage the community to reincorporate all three categories of methods in multivariate time series anomaly detection benchmarks.

Optimization Strategies in Multi-Task Learning: Averaged or Independent Losses?

Oct 04, 2021
Lucas Pascal, Pietro Michiardi, Xavier Bost, Benoit Huet, Maria A. Zuluaga

In Multi-Task Learning (MTL), it is a common practice to train multi-task networks by optimizing an objective function which is a weighted average of the task-specific objective functions. Although the computational advantages of this strategy are clear, the complexity of the resulting loss landscape has not been studied in the literature. Arguably, its optimization may be more difficult than a separate optimization of the constituting task-specific objectives. In this work, we investigate the benefits of such an alternative by alternating independent gradient descent steps on the different task-specific objective functions, and we formulate a novel way to combine this approach with state-of-the-art optimizers. As the separation of task-specific objectives comes at the cost of increased computational time, we propose random task grouping as a trade-off between better optimization and computational efficiency. Experimental results over three well-known visual MTL datasets show better overall absolute performance on losses and standard metrics compared to an averaged objective function and other state-of-the-art MTL methods. In particular, our method shows the most benefits when dealing with tasks of different nature, and it enables a wider exploration of the shared parameter space. We also show that our random grouping strategy allows us to trade off between these benefits and computational efficiency.
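
A minimal sketch contrasting the averaged-loss update with the alternating, per-task (or randomly grouped) updates investigated here. The task-conditioned model signature, the data structure for tasks, and the equal weighting are illustrative assumptions rather than the paper's exact training code.

import random
import torch

def averaged_step(model, optimizer, tasks):
    """One step on the equally weighted average of all task losses.
    `tasks` is a list of (loss_fn, (x, y)) pairs, one entry per task."""
    optimizer.zero_grad()
    total = sum(loss_fn(model(x, task=t), y) for t, (loss_fn, (x, y)) in enumerate(tasks))
    (total / len(tasks)).backward()
    optimizer.step()

def alternating_step(model, optimizer, tasks, group_size=1):
    """Independent gradient steps on randomly grouped task-specific objectives;
    group_size=1 recovers fully separated per-task updates, while larger groups
    trade optimization benefits for lower computational cost."""
    order = list(range(len(tasks)))
    random.shuffle(order)
    for i in range(0, len(order), group_size):
        group = order[i:i + group_size]
        optimizer.zero_grad()
        total = sum(tasks[t][0](model(tasks[t][1][0], task=t), tasks[t][1][1]) for t in group)
        (total / len(group)).backward()
        optimizer.step()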

Optimization Strategies in Multi-Task Learning: Averaged or Separated Losses?

Sep 21, 2021
Lucas Pascal, Pietro Michiardi, Xavier Bost, Benoit Huet, Maria A. Zuluaga

In Multi-Task Learning (MTL), it is a common practice to train multi-task networks by optimizing an objective function which is a weighted average of the task-specific objective functions. Although the computational advantages of this strategy are clear, the complexity of the resulting loss landscape has not been studied in the literature. Arguably, its optimization may be more difficult than a separate optimization of the constituting task-specific objectives. In this work, we investigate the benefits of such an alternative by alternating independent gradient descent steps on the different task-specific objective functions, and we formulate a novel way to combine this approach with state-of-the-art optimizers. As the separation of task-specific objectives comes at the cost of increased computational time, we propose random task grouping as a trade-off between better optimization and computational efficiency. Experimental results over three well-known visual MTL datasets show better overall absolute performance on losses and standard metrics compared to an averaged objective function and other state-of-the-art MTL methods. In particular, our method shows the most benefits when dealing with tasks of different nature, and it enables a wider exploration of the shared parameter space. We also show that our random grouping strategy allows us to trade off between these benefits and computational efficiency.
