Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Daniel C. Alexander

Hawkes Institute, Department of Computer Science, University College London, London, United Kingdom

4D VQ-GAN: Synthesising Medical Scans at Any Time Point for Personalised Disease Progression Modelling of Idiopathic Pulmonary Fibrosis

Feb 08, 2025

An Zhao, Moucheng Xu, Ahmed H. Shahin, Wim Wuyts, Mark G. Jones, Joseph Jacob, Daniel C. Alexander

Abstract:Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accurately predicting future CT scans of early-stage IPF patients can aid in developing better treatment strategies, thereby improving survival outcomes. In this paper, we propose 4D Vector Quantised Generative Adversarial Networks (4D-VQ-GAN), a model capable of generating realistic CT volumes of IPF patients at any time point. The model is trained using a two-stage approach. In the first stage, a 3D-VQ-GAN is trained to reconstruct CT volumes. In the second stage, a Neural Ordinary Differential Equation (ODE) based temporal model is trained to capture the temporal dynamics of the quantised embeddings generated by the encoder in the first stage. We evaluate different configurations of our model for generating longitudinal CT scans and compare the results against ground truth data, both quantitatively and qualitatively. For validation, we conduct survival analysis using imaging biomarkers derived from generated CT scans and achieve a C-index comparable to that of biomarkers derived from the real CT scans. The survival analysis results demonstrate the potential clinical utility inherent to generated longitudinal CT scans, showing that they can reliably predict survival outcomes.

* 4D image synthesis, VQ-GAN, neural ODEs, spatial temporal disease progression modelling, CT, IPF

Via

Access Paper or Ask Questions

MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels

Nov 16, 2024

Moucheng Xu, Yukun Zhou, Tobias Goodwin-Allcock, Kimia Firoozabadi, Joseph Jacob, Daniel C. Alexander, Paddy J. Slator

Figure 1 for MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels

Figure 2 for MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels

Figure 3 for MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels

Figure 4 for MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels

Abstract:We introduce and demonstrate a new paradigm for quantitative parameter mapping in MRI. Parameter mapping techniques, such as diffusion MRI and quantitative MRI, have the potential to robustly and repeatably measure biologically-relevant tissue maps that strongly relate to underlying microstructure. Quantitative maps are calculated by fitting a model to multiple images, e.g. with least-squares or machine learning. However, the overwhelming majority of model fitting techniques assume that each voxel is independent, ignoring any co-dependencies in the data. This makes model fitting sensitive to voxelwise measurement noise, hampering reliability and repeatability. We propose a self-supervised deep variational approach that breaks the assumption of independent pixels, leveraging redundancies in the data to effectively perform data-driven regularisation of quantitative maps. We demonstrate that our approach outperforms current model fitting techniques in dMRI simulations and real data. Especially with a Gaussian mixture prior, our model enables sharper quantitative maps, revealing finer anatomical details that are not presented in the baselines. Our approach can hence support the clinical adoption of parameter mapping methods such as dMRI and qMRI.

* NeurIPS 2024 Workshop in Machine Learning and the Physical Sciences

Via

Access Paper or Ask Questions

Alternative Learning Paradigms for Image Quality Transfer

Nov 08, 2024

Ahmed Karam Eldaly, Matteo Figini, Daniel C. Alexander

Figure 1 for Alternative Learning Paradigms for Image Quality Transfer

Figure 2 for Alternative Learning Paradigms for Image Quality Transfer

Figure 3 for Alternative Learning Paradigms for Image Quality Transfer

Figure 4 for Alternative Learning Paradigms for Image Quality Transfer

Abstract:Image Quality Transfer (IQT) aims to enhance the contrast and resolution of low-quality medical images, e.g. obtained from low-power devices, with rich information learned from higher quality images. In contrast to existing IQT methods which adopt supervised learning frameworks, in this work, we propose two novel formulations of the IQT problem. The first approach uses an unsupervised learning framework, whereas the second is a combination of both supervised and unsupervised learning. The unsupervised learning approach considers a sparse representation (SRep) and dictionary learning model, which we call IQT-SRep, whereas the combination of supervised and unsupervised learning approach is based on deep dictionary learning (DDL), which we call IQT-DDL. The IQT-SRep approach trains two dictionaries using a SRep model using pairs of low- and high-quality volumes. Subsequently, the SRep of a low-quality block, in terms of the low-quality dictionary, can be directly used to recover the corresponding high-quality block using the high-quality dictionary. On the other hand, the IQT-DDL approach explicitly learns a high-resolution dictionary to upscale the input volume, while the entire network, including high dictionary generator, is simultaneously optimised to take full advantage of deep learning methods. The two models are evaluated using a low-field magnetic resonance imaging (MRI) application aiming to recover high-quality images akin to those obtained from high-field scanners. Experiments comparing the proposed approaches against state-of-the-art supervised deep learning IQT method (IQT-DL) identify that the two novel formulations of the IQT problem can avoid bias associated with supervised methods when tested using out-of-distribution data that differs from the distribution of the data the model was trained on. This highlights the potential benefit of these novel paradigms for IQT.

* Machine.Learning.for.Biomedical.Imaging. 2 (2023)
* Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2024:027

Via

Access Paper or Ask Questions

Unscrambling disease progression at scale: fast inference of event permutations with optimal transport

Oct 18, 2024

Peter A. Wijeratne, Daniel C. Alexander

Figure 1 for Unscrambling disease progression at scale: fast inference of event permutations with optimal transport

Figure 2 for Unscrambling disease progression at scale: fast inference of event permutations with optimal transport

Figure 3 for Unscrambling disease progression at scale: fast inference of event permutations with optimal transport

Figure 4 for Unscrambling disease progression at scale: fast inference of event permutations with optimal transport

Abstract:Disease progression models infer group-level temporal trajectories of change in patients' features as a chronic degenerative condition plays out. They provide unique insight into disease biology and staging systems with individual-level clinical utility. Discrete models consider disease progression as a latent permutation of events, where each event corresponds to a feature becoming measurably abnormal. However, permutation inference using traditional maximum likelihood approaches becomes prohibitive due to combinatoric explosion, severely limiting model dimensionality and utility. Here we leverage ideas from optimal transport to model disease progression as a latent permutation matrix of events belonging to the Birkhoff polytope, facilitating fast inference via optimisation of the variational lower bound. This enables a factor of 1000 times faster inference than the current state of the art and, correspondingly, supports models with several orders of magnitude more features than the current state of the art can consider. Experiments demonstrate the increase in speed, accuracy and robustness to noise in simulation. Further experiments with real-world imaging data from two separate datasets, one from Alzheimer's disease patients, the other age-related macular degeneration, showcase, for the first time, pixel-level disease progression events in the brain and eye, respectively. Our method is low compute, interpretable and applicable to any progressive condition and data modality, giving it broad potential clinical utility.

* Pre-print of version accepted to NeurIPS 2024

Via

Access Paper or Ask Questions

An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

Oct 04, 2024

Ahmed Abdulaal, Hugo Fry, Nina Montaña-Brown, Ayodeji Ijishakin, Jack Gao, Stephanie Hyland, Daniel C. Alexander, Daniel C. Castro

Abstract:Radiological services are experiencing unprecedented demand, leading to increased interest in automating radiology report generation. Existing Vision-Language Models (VLMs) suffer from hallucinations, lack interpretability, and require expensive fine-tuning. We introduce SAE-Rad, which uses sparse autoencoders (SAEs) to decompose latent representations from a pre-trained vision transformer into human-interpretable features. Our hybrid architecture combines state-of-the-art SAE advancements, achieving accurate latent reconstructions while maintaining sparsity. Using an off-the-shelf language model, we distil ground-truth reports into radiological descriptions for each SAE feature, which we then compile into a full report for each image, eliminating the need for fine-tuning large models for this task. To the best of our knowledge, SAE-Rad represents the first instance of using mechanistic interpretability techniques explicitly for a downstream multi-modal reasoning task. On the MIMIC-CXR dataset, SAE-Rad achieves competitive radiology-specific metrics compared to state-of-the-art models while using significantly fewer computational resources for training. Qualitative analysis reveals that SAE-Rad learns meaningful visual concepts and generates reports aligning closely with expert interpretations. Our results suggest that SAEs can enhance multimodal reasoning in healthcare, providing a more interpretable alternative to existing VLMs.

Via

Access Paper or Ask Questions

DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model

Aug 30, 2024

Mona Sheikh Zeinoddin, Chiara Lena, Jiongqi Qu, Luca Carlini, Mattia Magro, Seunghoi Kim, Elena De Momi, Sophia Bano, Matthew Grech-Sollars, Evangelos Mazomenos(+4 more)

Figure 1 for DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model

Figure 2 for DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model

Figure 3 for DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model

Figure 4 for DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model

Abstract:Robotic-assisted surgery (RAS) relies on accurate depth estimation for 3D reconstruction and visualization. While foundation models like Depth Anything Models (DAM) show promise, directly applying them to surgery often yields suboptimal results. Fully fine-tuning on limited surgical data can cause overfitting and catastrophic forgetting, compromising model robustness and generalization. Although Low-Rank Adaptation (LoRA) addresses some adaptation issues, its uniform parameter distribution neglects the inherent feature hierarchy, where earlier layers, learning more general features, require more parameters than later ones. To tackle this issue, we introduce Depth Anything in Robotic Endoscopic Surgery (DARES), a novel approach that employs a new adaptation technique, Vector Low-Rank Adaptation (Vector-LoRA) on the DAM V2 to perform self-supervised monocular depth estimation in RAS scenes. To enhance learning efficiency, we introduce Vector-LoRA by integrating more parameters in earlier layers and gradually decreasing parameters in later layers. We also design a reprojection loss based on the multi-scale SSIM error to enhance depth perception by better tailoring the foundation model to the specific requirements of the surgical environment. The proposed method is validated on the SCARED dataset and demonstrates superior performance over recent state-of-the-art self-supervised monocular depth estimation techniques, achieving an improvement of 13.3% in the absolute relative error metric. The code and pre-trained weights are available at https://github.com/mobarakol/DARES.

* 11 pages

Via

Access Paper or Ask Questions

Image Quality Transfer of Diffusion MRI Guided By High-Resolution Structural MRI

Aug 06, 2024

Alp G. Cicimen, Henry F. J. Tregidgo, Matteo Figini, Eirini Messaritaki, Carolyn B. McNabb, Marco Palombo, C. John Evans, Mara Cercignani, Derek K. Jones, Daniel C. Alexander

Figure 1 for Image Quality Transfer of Diffusion MRI Guided By High-Resolution Structural MRI

Figure 2 for Image Quality Transfer of Diffusion MRI Guided By High-Resolution Structural MRI

Figure 3 for Image Quality Transfer of Diffusion MRI Guided By High-Resolution Structural MRI

Figure 4 for Image Quality Transfer of Diffusion MRI Guided By High-Resolution Structural MRI

Abstract:Prior work on the Image Quality Transfer on Diffusion MRI (dMRI) has shown significant improvement over traditional interpolation methods. However, the difficulty in obtaining ultra-high resolution Diffusion MRI scans poses a problem in training neural networks to obtain high-resolution dMRI scans. Here we hypothesise that the inclusion of structural MRI images, which can be acquired at much higher resolutions, can be used as a guide to obtaining a more accurate high-resolution dMRI output. To test our hypothesis, we have constructed a novel framework that incorporates structural MRI scans together with dMRI to obtain high-resolution dMRI scans. We set up tests which evaluate the validity of our claim through various configurations and compare the performance of our approach against a unimodal approach. Our results show that the inclusion of structural MRI scans do lead to an improvement in high-resolution image prediction when T1w data is incorporated into the model input.

Via

Access Paper or Ask Questions

SCREENER: A general framework for task-specific experiment design in quantitative MRI

Aug 06, 2024

Tianshu Zheng, Zican Wang, Timothy Bray, Daniel C. Alexander, Dan Wu, Hui Zhang

Abstract:Quantitative magnetic resonance imaging (qMRI) is increasingly investigated for use in a variety of clinical tasks from diagnosis, through staging, to treatment monitoring. However, experiment design in qMRI, the identification of the optimal acquisition protocols, has been focused on obtaining the most precise parameter estimations, with no regard for the specific requirements of downstream tasks. Here we propose SCREENER: A general framework for task-specific experiment design in quantitative MRI. SCREENER incorporates a task-specific objective and seeks the optimal protocol with a deep-reinforcement-learning (DRL) based optimization strategy. To illustrate this framework, we employ a task of classifying the inflammation status of bone marrow using diffusion MRI data with intravoxel incoherent motion (IVIM) modelling. Results demonstrate SCREENER outperforms previous ad hoc and optimized protocols under clinical signal-to-noise ratio (SNR) conditions, achieving significant improvement, both in binary classification tasks, e.g. from 67% to 89%, and in a multi-class classification task, from 46% to 59%. Additionally, we show this improvement is robust to the SNR. Lastly, we demonstrate the advantage of DRL-based optimization strategy, enabling zero-shot discovery of near-optimal protocols for a range of SNRs not used in training. In conclusion, SCREENER has the potential to enable wider uptake of qMRI in the clinic.

Via

Access Paper or Ask Questions

Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge

May 06, 2024

Lemuel Puglisi, Daniel C. Alexander, Daniele Ravì

Figure 1 for Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge

Figure 2 for Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge

Figure 3 for Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge

Abstract:In this work, we introduce Brain Latent Progression (BrLP), a novel spatiotemporal disease progression model based on latent diffusion. BrLP is designed to predict the evolution of diseases at the individual level on 3D brain MRIs. Existing deep generative models developed for this task are primarily data-driven and face challenges in learning disease progressions. BrLP addresses these challenges by incorporating prior knowledge from disease models to enhance the accuracy of predictions. To implement this, we propose to integrate an auxiliary model that infers volumetric changes in various brain regions. Additionally, we introduce Latent Average Stabilization (LAS), a novel technique to improve spatiotemporal consistency of the predicted progression. BrLP is trained and evaluated on a large dataset comprising 11,730 T1-weighted brain MRIs from 2,805 subjects, collected from three publicly available, longitudinal Alzheimer's Disease (AD) studies. In our experiments, we compare the MRI scans generated by BrLP with the actual follow-up MRIs available from the subjects, in both cross-sectional and longitudinal settings. BrLP demonstrates significant improvements over existing methods, with an increase of 22% in volumetric accuracy across AD-related brain regions and 43% in image similarity to the ground-truth scans. The ability of BrLP to generate conditioned 3D scans at the subject level, along with the novelty of integrating prior knowledge to enhance accuracy, represents a significant advancement in disease progression modeling, opening new avenues for precision medicine. The code of BrLP is available at the following link: https://github.com/LemuelPuglisi/BrLP.

Via

Access Paper or Ask Questions

Tackling Structural Hallucination in Image Translation with Local Diffusion

Apr 13, 2024

Seunghoi Kim, Chen Jin, Tom Diethe, Matteo Figini, Henry F. J. Tregidgo, Asher Mullokandov, Philip Teare, Daniel C. Alexander

Figure 1 for Tackling Structural Hallucination in Image Translation with Local Diffusion

Figure 2 for Tackling Structural Hallucination in Image Translation with Local Diffusion

Figure 3 for Tackling Structural Hallucination in Image Translation with Local Diffusion

Figure 4 for Tackling Structural Hallucination in Image Translation with Local Diffusion

Abstract:Recent developments in diffusion models have advanced conditioned image generation, yet they struggle with reconstructing out-of-distribution (OOD) images, such as unseen tumors in medical images, causing ``image hallucination'' and risking misdiagnosis. We hypothesize such hallucinations result from local OOD regions in the conditional images. We verify that partitioning the OOD region and conducting separate image generations alleviates hallucinations in several applications. From this, we propose a training-free diffusion framework that reduces hallucination with multiple Local Diffusion processes. Our approach involves OOD estimation followed by two modules: a ``branching'' module generates locally both within and outside OOD regions, and a ``fusion'' module integrates these predictions into one. Our evaluation shows our method mitigates hallucination over baseline models quantitatively and qualitatively, reducing misdiagnosis by 40% and 25% in the real-world medical and natural image datasets, respectively. It also demonstrates compatibility with various pre-trained diffusion models.

Via

Access Paper or Ask Questions