Simona Bottani

ARAMIS
Automatic quality control of brain T1-weighted magnetic resonance images for a clinical data warehouse

Apr 16, 2021
Simona Bottani, Ninon Burgos, Aurélien Maire, Adam Wild, Sebastian Ströer, Didier Dormont, Olivier Colliot

Many studies on machine learning (ML) for computer-aided diagnosis have so far been mostly restricted to high-quality research data. Clinical data warehouses, which gather routine examinations from hospitals, hold great promise for the training and validation of ML models in a realistic setting. However, using such clinical data warehouses requires quality control (QC) tools. Visual QC by experts is time-consuming and does not scale to large datasets. In this paper, we propose a convolutional neural network (CNN) for the automatic QC of 3D T1-weighted brain MRI in a large heterogeneous clinical data warehouse. To this end, we used the data warehouse of the hospitals of the Greater Paris area (Assistance Publique-Hôpitaux de Paris [AP-HP]). Specifically, the objectives were: 1) to identify images which are not proper T1-weighted brain MRIs; 2) to identify acquisitions for which gadolinium was injected; 3) to rate the overall image quality. We used 5000 images for training and validation and a separate set of 500 images for testing. To train and validate the CNN, the data were annotated by two trained raters according to a visual QC protocol that we specifically designed for application in the setting of a data warehouse. For objectives 1 and 2, our approach achieved excellent accuracy (balanced accuracy and F1-score > 90%), similar to the human raters. For objective 3, the performance was good but substantially lower than that of the human raters. Nevertheless, the automatic approach accurately identified low-quality images (balanced accuracy and F1-score > 80%), which would typically need to be excluded. Overall, our approach should be useful for exploiting hospital data warehouses in medical image computing.
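The two metrics reported here, balanced accuracy and F1-score, can be sketched in plain Python. This is an illustrative re-implementation for clarity, not code from the paper:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; robust to class imbalance."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        n_c = sum(1 for t in y_true if t == c)
        recalls.append(tp / n_c)
    return sum(recalls) / len(recalls)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Balanced accuracy averages the per-class recalls, which keeps the score meaningful even when one class (e.g. non-T1w or low-quality images in a warehouse) is rare.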

Gaussian Graphical Model exploration and selection in high dimension low sample size setting

Mar 11, 2020
Thomas Lartigue, Simona Bottani, Stephanie Baron, Olivier Colliot, Stanley Durrleman, Stéphanie Allassonnière

Gaussian Graphical Models (GGMs) are often used to describe the conditional correlations between the components of a random vector. In this article, we compare two families of GGM inference methods: nodewise edge selection and penalised likelihood maximisation. We demonstrate on synthetic data that, when the sample size is small, the two methods produce graphs with either too few or too many edges compared to the true graph. As a result, we propose a composite procedure that explores a family of graphs with a nodewise numerical scheme and selects a candidate among them with an overall likelihood criterion. We demonstrate that, when the number of observations is small, this selection method yields graphs closer to the truth, corresponding to distributions with a smaller KL divergence from the true distribution than either of the other two methods. Finally, we illustrate the value of our algorithm on two concrete cases: first on brain imaging data, then on biological nephrology data. In both cases our results are more in line with current knowledge in each field.
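The KL divergence used above to compare a fitted Gaussian to the true distribution has a closed form for multivariate Gaussians; a minimal numpy sketch (illustrative only, not the paper's code):

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) between two multivariate Gaussians:
    0.5 * ( tr(S1^-1 S0) + (m1-m0)^T S1^-1 (m1-m0) - d + ln(det S1 / det S0) ).
    """
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (
        np.trace(cov1_inv @ cov0)
        + diff @ cov1_inv @ diff
        - d
        + np.log(np.linalg.det(cov1) / np.linalg.det(cov0))
    )
```

The divergence is zero only when the two distributions coincide, so a smaller value of `gaussian_kl` against the true parameters indicates a better-fitting candidate graph.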

Convolutional Neural Networks for Classification of Alzheimer's Disease: Overview and Reproducible Evaluation

Apr 16, 2019
Junhao Wen, Elina Thibeau-Sutre, Jorge Samper-Gonzalez, Alexandre Routier, Simona Bottani, Stanley Durrleman, Ninon Burgos, Olivier Colliot

In the past two years, over 30 papers have proposed to use convolutional neural networks (CNNs) for AD classification. However, the classification performances across studies are difficult to compare. Moreover, these studies are hardly reproducible because their frameworks are not publicly accessible. Lastly, some of these papers may have reported biased performances due to inadequate or unclear validation procedures, and it is often unclear how the model architecture and parameters were chosen. In the present work, we aim to address these limitations through three main contributions. First, we performed a systematic literature review of studies using CNNs for AD classification from anatomical MRI. We identified four main types of approaches: 2D slice-level, 3D patch-level, ROI-based and 3D subject-level CNNs. Moreover, we found that more than half of the surveyed papers may have suffered from data leakage and thus reported biased performances. Our second contribution is an open-source framework for the classification of AD. Thirdly, we used this framework to rigorously compare different CNN architectures, which are representative of the existing literature, and to study the influence of key components on classification performance. On the validation set, the ROI-based (hippocampus) CNN achieved the highest balanced accuracy (0.86 for AD vs CN and 0.80 for sMCI vs pMCI) compared to other approaches. Transfer learning with autoencoder pre-training did not improve the average accuracy but reduced the variance. Training on longitudinal data resulted in similar or higher performance, depending on the approach, than training with only baseline data. Sophisticated image preprocessing did not improve the results. Lastly, CNNs performed similarly to a standard SVM for the AD vs CN task but outperformed the SVM for the sMCI vs pMCI task, demonstrating the potential of deep learning for challenging diagnostic tasks.
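One common source of the data leakage discussed above is splitting 2D slices of the same subject across the train and test sets. A minimal sketch of a subject-level split that avoids this (the function and variable names are hypothetical, not from the released framework):

```python
import random

def subject_level_split(slice_subject_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) over slices such that all slices of a
    given subject land on the same side of the split."""
    subjects = sorted(set(slice_subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(slice_subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(slice_subject_ids) if s in test_subjects]
    return train_idx, test_idx
```

Splitting at the slice level instead would let a classifier memorise subject-specific anatomy and report inflated accuracy, which is exactly the bias the review identifies.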

Reproducible evaluation of diffusion MRI features for automatic classification of patients with Alzheimer's disease

Dec 28, 2018
Junhao Wen, Jorge Samper-Gonzalez, Simona Bottani, Alexandre Routier, Ninon Burgos, Thomas Jacquemont, Sabrina Fontanella, Stanley Durrleman, Stephane Epelbaum, Anne Bertrand, Olivier Colliot

Diffusion MRI is the modality of choice to study alterations of white matter. In past years, various works have used diffusion MRI for the automatic classification of Alzheimer's disease. However, the performances obtained with different approaches are difficult to compare because of variations in components such as input data, participant selection, image preprocessing, feature extraction, feature selection (FS) and cross-validation (CV) procedures. Moreover, these studies are also difficult to reproduce because these different components are not readily available. In a previous work (Samper-Gonzalez et al. 2018), we proposed an open-source framework for the reproducible evaluation of AD classification from T1-weighted (T1w) MRI and PET data. In the present paper, we extend this framework to diffusion MRI data. The framework comprises: tools to automatically convert ADNI data into the BIDS standard, pipelines for image preprocessing and feature extraction, baseline classifiers and a rigorous CV procedure. We demonstrate the use of the framework by assessing the influence of diffusion tensor imaging (DTI) metrics (fractional anisotropy [FA] and mean diffusivity [MD]), feature types, imaging modalities (diffusion MRI or T1w MRI), data imbalance and FS bias. First, voxel-wise features generally gave better performances than regional features. Secondly, FA and MD provided comparable results for voxel-wise features. Thirdly, T1w MRI performed better than diffusion MRI. Fourthly, we demonstrated that using non-nested validation of FS leads to unreliable and over-optimistic results. All the code is publicly available: general-purpose tools have been integrated into the Clinica software (www.clinica.run) and the paper-specific code is available at: https://gitlab.icm-institute.org/aramislab/AD-ML.
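The non-nested FS bias noted above arises when features are selected on the whole dataset before cross-validation, so the test folds leak into the selection. A sketch of the non-leaky alternative, re-running a simple correlation-based FS inside each training fold (illustrative only; the actual pipeline lives in the linked repositories):

```python
import numpy as np

def cv_with_nested_fs(X, y, n_folds=5, n_keep=10, seed=0):
    """K-fold CV where feature selection (top-|correlation| features) is
    redone inside each training fold, so the test fold never influences it."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    accs = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Feature selection on the training fold only (non-leaky).
        corr = np.abs((X[train] - X[train].mean(0)).T @ (y[train] - y[train].mean()))
        keep = np.argsort(corr)[-n_keep:]
        # Nearest-class-mean classifier on the selected features.
        mu0 = X[train][y[train] == 0][:, keep].mean(0)
        mu1 = X[train][y[train] == 1][:, keep].mean(0)
        d0 = ((X[test][:, keep] - mu0) ** 2).sum(1)
        d1 = ((X[test][:, keep] - mu1) ** 2).sum(1)
        pred = (d1 < d0).astype(int)
        accs.append((pred == y[test]).mean())
    return float(np.mean(accs))
```

Moving the `corr`/`keep` computation outside the fold loop reproduces the leaky, non-nested variant whose over-optimism the paper demonstrates.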

* 51 pages, 5 figures and 6 tables 
Reproducible evaluation of classification methods in Alzheimer's disease: framework and application to MRI and PET data

Aug 20, 2018
Jorge Samper-González, Ninon Burgos, Simona Bottani, Sabrina Fontanella, Pascal Lu, Arnaud Marcoux, Alexandre Routier, Jérémy Guillon, Michael Bacci, Junhao Wen, Anne Bertrand, Hugo Bertin, Marie-Odile Habert, Stanley Durrleman, Theodoros Evgeniou, Olivier Colliot, for the Alzheimer's Disease Neuroimaging Initiative and the Australian Imaging Biomarkers and Lifestyle flagship study of ageing

A large number of papers have introduced novel machine learning and feature extraction methods for the automatic classification of AD. However, they are difficult to reproduce because key components of the validation are often not readily available. These components include the selected participants and input data, image preprocessing and cross-validation procedures. The performance of the different approaches is also difficult to compare objectively. In particular, it is often difficult to assess which part of a method provides a real improvement, if any. We propose a framework for reproducible and objective classification experiments in AD using three publicly available datasets (ADNI, AIBL and OASIS). The framework comprises: i) automatic conversion of the three datasets into BIDS format; ii) a modular set of preprocessing pipelines, feature extraction and classification methods, together with an evaluation framework, which provides a baseline for benchmarking the different components. We demonstrate the use of the framework for a large-scale evaluation on 1960 participants using T1 MRI and FDG PET data. In this evaluation, we assess the influence of different modalities, preprocessing, feature types, classifiers, training set sizes and datasets. Performances were in line with the state of the art. FDG PET outperformed T1 MRI for all classification tasks. No difference in performance was found for the use of different atlases, image smoothing, partial volume correction of FDG PET images, or feature type. Linear SVM and L2-logistic regression resulted in similar performance and both outperformed random forests. The classification performance increased along with the number of subjects used for training. Classifiers trained on ADNI generalized well to AIBL and OASIS. All the code of the framework and the experiments is publicly available at: https://gitlab.icm-institute.org/aramislab/AD-ML.
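As a rough stand-in for the linear classifiers benchmarked here (linear SVM, L2-logistic regression), a minimal gradient-descent implementation of L2-penalised logistic regression in numpy; this is an illustrative sketch, not the framework's implementation:

```python
import numpy as np

def fit_l2_logistic(X, y, lam=1.0, lr=0.1, n_iter=500):
    """Plain gradient descent on the L2-penalised logistic loss:
    mean cross-entropy + (lam/2) * ||w||^2 (bias left unpenalised)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad_w = X.T @ (p - y) / len(y) + lam * w
        grad_b = (p - y).mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

The L2 penalty `lam` plays the same regularising role as the margin parameter of a linear SVM, which is one reason the two classifiers behave so similarly in the reported evaluation.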
