Contrast Enhanced Spectral Mammography (CESM) is a dual-energy mammographic imaging technique that first needs intravenously administration of an iodinated contrast medium; then, it collects both a low-energy image, comparable to standard mammography, and a high-energy image. The two scans are then combined to get a recombined image showing contrast enhancement. Despite CESM diagnostic advantages for breast cancer diagnosis, the use of contrast medium can cause side effects, and CESM also beams patients with a higher radiation dose compared to standard mammography. To address these limitations this work proposes to use deep generative models for virtual contrast enhancement on CESM, aiming to make the CESM contrast-free as well as to reduce the radiation dose. Our deep networks, consisting of an autoencoder and two Generative Adversarial Networks, the Pix2Pix, and the CycleGAN, generate synthetic recombined images solely from low-energy images. We perform an extensive quantitative and qualitative analysis of the model's performance, also exploiting radiologists' assessments, on a novel CESM dataset that includes 1138 images that, as a further contribution of this work, we make publicly available. The results show that CycleGAN is the most promising deep network to generate synthetic recombined images, highlighting the potential of artificial intelligence techniques for virtual contrast enhancement in this field.
One of the most challenging fields where Artificial Intelligence (AI) can be applied is lung cancer research, specifically non-small cell lung cancer (NSCLC). In particular, overall survival (OS), the time between diagnosis and death, is a vital indicator of patient status, enabling tailored treatment and improved OS rates. In this analysis, there are two challenges to take into account. First, few studies effectively exploit the information available from each patient, leveraging both uncensored (i.e., dead) and censored (i.e., survivors) patients, considering also the events' time. Second, the handling of incomplete data is a common issue in the medical field. This problem is typically tackled through the use of imputation methods. Our objective is to present an AI model able to overcome these limits, effectively learning from both censored and uncensored patients and their available features, for the prediction of OS for NSCLC patients. We present a novel approach to survival analysis with missing values in the context of NSCLC, which exploits the strengths of the transformer architecture to account only for available features without requiring any imputation strategy. By making use of ad-hoc losses for OS, it is able to account for both censored and uncensored patients, as well as changes in risks over time. We compared our method with state-of-the-art models for survival analysis coupled with different imputation strategies. We evaluated the results obtained over a period of 6 years using different time granularities obtaining a Ct-index, a time-dependent variant of the C-index, of 71.97, 77.58 and 80.72 for time units of 1 month, 1 year and 2 years, respectively, outperforming all state-of-the-art methods regardless of the imputation method used.
One of the most challenging fields where Artificial Intelligence (AI) can be applied is lung cancer research, specifically non-small cell lung cancer (NSCLC). In particular, overall survival (OS) is a vital indicator of patient status, helping to identify subgroups with diverse survival probabilities, enabling tailored treatment and improved OS rates. In this analysis, there are two challenges to take into account. First, few studies effectively exploit the information available from each patient, leveraging both uncensored (i.e., dead) and censored (i.e., survivors) patients, considering also the death times. Second, the handling of incomplete data is a common issue in the medical field. This problem is typically tackled through the use of imputation methods. Our objective is to present an AI model able to overcome these limits, effectively learning from both censored and uncensored patients and their available features, for the prediction of OS for NSCLC patients. We present a novel approach to survival analysis in the context of NSCLC, which exploits the strengths of the transformer architecture accounting for only available features without requiring any imputation strategy. By making use of ad-hoc losses for OS, it accounts for both censored and uncensored patients, considering risks over time. We evaluated the results over a period of 6 years using different time granularities obtaining a Ct-index, a time-dependent variant of the C-index, of 71.97, 77.58 and 80.72 for time units of 1 month, 1 year and 2 years, respectively, outperforming all state-of-the-art methods regardless of the imputation method used.
Data Augmentation (DA) is a technique to increase the quantity and diversity of the training data, and by that alleviate overfitting and improve generalisation. However, standard DA produces synthetic data for augmentation with limited diversity. Generative Adversarial Networks (GANs) may unlock additional information in a dataset by generating synthetic samples having the appearance of real images. However, these models struggle to simultaneously address three key requirements: fidelity and high-quality samples; diversity and mode coverage; and fast sampling. Indeed, GANs generate high-quality samples rapidly, but have poor mode coverage, limiting their adoption in DA applications. We propose LatentAugment, a DA strategy that overcomes the low diversity of GANs, opening up for use in DA applications. Without external supervision, LatentAugment modifies latent vectors and moves them into latent space regions to maximise the synthetic images' diversity and fidelity. It is also agnostic to the dataset and the downstream task. A wide set of experiments shows that LatentAugment improves the generalisation of a deep model translating from MRI-to-CT beating both standard DA as well GAN-based sampling. Moreover, still in comparison with GAN-based sampling, LatentAugment synthetic samples show superior mode coverage and diversity. Code is available at: https://github.com/ltronchin/LatentAugment.
The integration of renewable energy sources (RES) into modern power systems has become increasingly important due to climate change and macroeconomic and geopolitical instability. Among the RES, photovoltaic (PV) energy is rapidly emerging as one of the world's most promising. However, its widespread adoption poses challenges related to its inherently uncertain nature that can lead to imbalances in the electrical system. Therefore, accurate forecasting of PV production can help resolve these uncertainties and facilitate the integration of PV into modern power systems. Currently, PV forecasting methods can be divided into two main categories: physics-based and data-based strategies, with AI-based models providing state-of-the-art performance in PV power forecasting. However, while these AI-based models can capture complex patterns and relationships in the data, they ignore the underlying physical prior knowledge of the phenomenon. Therefore, we propose MATNet, a novel self-attention transformer-based architecture for multivariate multi-step day-ahead PV power generation forecasting. It consists of a hybrid approach that combines the AI paradigm with the prior physical knowledge of PV power generation of physics-based methods. The model is fed with historical PV data and historical and forecast weather data through a multi-level joint fusion approach. The effectiveness of the proposed model is evaluated using the Ausgrid benchmark dataset with different regression performance metrics. The results show that our proposed architecture significantly outperforms the current state-of-the-art methods with an RMSE equal to 0.0460. These findings demonstrate the potential of MATNet in improving forecasting accuracy and suggest that it could be a promising solution to facilitate the integration of PV energy into the power grid.
The current deep learning approaches for low-dose CT denoising can be divided into paired and unpaired methods. The former involves the use of well-paired datasets, whilst the latter relaxes this constraint. The large availability of unpaired datasets has raised the interest in deepening unpaired denoising strategies that, in turn, need for robust evaluation techniques going beyond the qualitative evaluation. To this end, we can use quantitative image quality assessment scores that we divided into two categories, i.e., paired and unpaired measures. However, the interpretation of unpaired metrics is not straightforward, also because the consistency with paired metrics has not been fully investigated. To cope with this limitation, in this work we consider 15 paired and unpaired scores, which we applied to assess the performance of low-dose CT denoising. We perform an in-depth statistical analysis that not only studies the correlation between paired and unpaired metrics but also within each category. This brings out useful guidelines that can help researchers and practitioners select the right measure for their applications.
We are witnessing a widespread adoption of artificial intelligence in healthcare. However, most of the advancements in deep learning (DL) in this area consider only unimodal data, neglecting other modalities. Their multimodal interpretation necessary for supporting diagnosis, prognosis and treatment decisions. In this work we present a deep architecture, explainable by design, which jointly learns modality reconstructions and sample classifications using tabular and imaging data. The explanation of the decision taken is computed by applying a latent shift that, simulates a counterfactual prediction revealing the features of each modality that contribute the most to the decision and a quantitative score indicating the modality importance. We validate our approach in the context of COVID-19 pandemic using the AIforCOVID dataset, which contains multimodal data for the early identification of patients at risk of severe outcome. The results show that the proposed method provides meaningful explanations without degrading the classification performance.
The current cancer treatment practice collects multimodal data, such as radiology images, histopathology slides, genomics and clinical data. The importance of these data sources taken individually has fostered the recent raise of radiomics and pathomics, i.e. the extraction of quantitative features from radiology and histopathology images routinely collected to predict clinical outcomes or to guide clinical decisions using artificial intelligence algorithms. Nevertheless, how to combine them into a single multimodal framework is still an open issue. In this work we therefore develop a multimodal late fusion approach that combines hand-crafted features computed from radiomics, pathomics and clinical data to predict radiation therapy treatment outcomes for non-small-cell lung cancer patients. Within this context, we investigate eight different late fusion rules (i.e. product, maximum, minimum, mean, decision template, Dempster-Shafer, majority voting, and confidence rule) and two patient-wise aggregation rules leveraging the richness of information given by computer tomography images and whole-slide scans. The experiments in leave-one-patient-out cross-validation on an in-house cohort of 33 patients show that the proposed multimodal paradigm with an AUC equal to $90.9\%$ outperforms each unimodal approach, suggesting that data integration can advance precision medicine. As a further contribution, we also compare the hand-crafted representations with features automatically computed by deep networks, and the late fusion paradigm with early fusion, another popular multimodal approach. In both cases, the experiments show that the proposed multimodal approach provides the best results.
The COVID-19 pandemic has caused millions of cases and deaths and the AI-related scientific community, after being involved with detecting COVID-19 signs in medical images, has been now directing the efforts towards the development of methods that can predict the progression of the disease. This task is multimodal by its very nature and, recently, baseline results achieved on the publicly available AIforCOVID dataset have shown that chest X-ray scans and clinical information are useful to identify patients at risk of severe outcomes. While deep learning has shown superior performance in several medical fields, in most of the cases it considers unimodal data only. In this respect, when, which and how to fuse the different modalities is an open challenge in multimodal deep learning. To cope with these three questions here we present a novel approach optimizing the setup of a multimodal end-to-end model. It exploits Pareto multi-objective optimization working with a performance metric and the diversity score of multiple candidate unimodal neural networks to be fused. We test our method on the AIforCOVID dataset, attaining state-of-the-art results, not only outperforming the baseline performance but also being robust to external validation. Moreover, exploiting XAI algorithms we figure out a hierarchy among the modalities and we extract the features' intra-modality importance, enriching the trust on the predictions made by the model.