Abstract:Purpose: Diffusion MRI (dMRI) provides a diverse set of quantitative measures and derived datatypes to assess white matter microstructure and macrostructure. Coupled with the increasing size of imaging studies using dMRI, the number of downstream outputs requiring quality control (QC) will continue to grow. Previous work has shown that failure modes which are often not evident from aggregate metrics or summary statistics can be identified through structured visual inspection. This work aims to better understand common failure modes and the expected characteristics of valid dMRI processing outputs to ensure the validity and interpretability of quantitative findings. Approach: We deployed a structured QC framework to assess 18,328 dMRI scans across nine datasets, visually evaluating the outputs of seven processing pipelines representative of conventional dMRI analyses. Results: Downstream outputs that pass visual QC may still rely on failed upstream dependencies; such failures may only be visually detectable through systematic inspection of the full pipeline hierarchy. Additionally, appropriate QC granularity is algorithm-specific, as the spatial structure of each algorithm's outputs determines whether failures warrant selective or global exclusion. Conclusion: This work demonstrates the feasibility and analytical value of large-scale, structured QC for dMRI processing pipelines. Our results highlight the need for systematic QC spanning the full processing hierarchy to ensure the validity and interpretability of quantitative findings.
Abstract:In cases of prevalent diseases and disorders, such as Prenatal Alcohol Exposure (PAE), multi-site data collection allows for increased study samples. However, multi-site studies introduce additional variability through heterogeneous collection materials, such as scanner and acquisition protocols, which confound with biologically relevant signals. Neuroscientists often utilize statistical methods on image-derived metrics, such as volume of regions of interest, after all image processing to minimize site-related variance. HACA3, a deep learning harmonization method, offers an opportunity to harmonize image signals prior to metric quantification; however, HACA3 has not yet been validated in a pediatric cohort. In this work, we investigate HACA3's ability to remove site-related variance and preserve biologically relevant signal compared to a statistical method, neuroCombat, and pair HACA3 processing with neuroCombat to evaluate the efficacy of multiple harmonization methods in a pediatric (age 7 to 21) population across three unique scanners with controls and cases of PAE with downstream MaCRUISE volume metrics. We find that HACA3 qualitatively improves inter-site contrast variations, but statistical methods reduce greater site-related variance within the MaCRUISE volume metrics following an ANCOVA test, and HACA3 relies on follow-up statistical methods to approach maximal biological preservation in this context.