Mert R. Sabuncu

Empirical Analysis of a Segmentation Foundation Model in Prostate Imaging

Jul 06, 2023
Heejong Kim, Victor Ion Butoi, Adrian V. Dalca, Mert R. Sabuncu

Most state-of-the-art techniques for medical image segmentation rely on deep-learning models. These models, however, are often trained on narrowly-defined tasks in a supervised fashion, which requires expensive labeled datasets. Recent advances in several machine learning domains, such as natural language generation, have demonstrated the feasibility and utility of building foundation models that can be customized for various downstream tasks with little to no labeled data. This likely represents a paradigm shift for medical imaging, where we expect foundation models to shape the future of the field. In this paper, we consider a recently developed foundation model for medical image segmentation, UniverSeg. We conduct an empirical evaluation study in the context of prostate imaging and compare it against the conventional approach of training a task-specific segmentation model. Our results and discussion highlight several factors that will likely be important in the development and adoption of foundation models for medical image segmentation.

* Under Review 

Patchwork Learning: A Paradigm Towards Integrative Analysis across Diverse Biomedical Data Sources

May 13, 2023
Suraj Rajendran, Weishen Pan, Mert R. Sabuncu, Yong Chen, Jiayu Zhou, Fei Wang

Machine learning (ML) in healthcare presents numerous opportunities for enhancing patient care, population health, and healthcare providers' workflows. However, the real-world clinical and cost benefits remain limited due to challenges in data privacy, heterogeneous data sources, and the inability to fully leverage multiple data modalities. In this perspective paper, we introduce "patchwork learning" (PL), a novel paradigm that addresses these limitations by integrating information from disparate datasets composed of different data modalities (e.g., clinical free-text, medical images, omics) and distributed across separate and secure sites. PL allows the simultaneous utilization of complementary data sources while preserving data privacy, enabling the development of more holistic and generalizable ML models. We present the concept of patchwork learning and its current implementations in healthcare, exploring the potential opportunities and applicable data sources for addressing various healthcare challenges. PL leverages bridging modalities or overlapping feature spaces across sites to facilitate information sharing and impute missing data, thereby addressing related prediction tasks. We discuss the challenges associated with PL, many of which are shared by federated and multimodal learning, and provide recommendations for future research in this field. By offering a more comprehensive approach to healthcare data integration, patchwork learning has the potential to revolutionize the clinical applicability of ML models. This paradigm promises to strike a balance between personalization and generalizability, ultimately enhancing patient experiences, improving population health, and optimizing healthcare providers' workflows.

A robust and interpretable deep learning framework for multi-modal registration via keypoints

Apr 19, 2023
Alan Q. Wang, Evan M. Yu, Adrian V. Dalca, Mert R. Sabuncu

We present KeyMorph, a deep learning-based image registration framework that relies on automatically detecting corresponding keypoints. State-of-the-art deep learning methods for registration are often not robust to large misalignments, are not interpretable, and do not incorporate the symmetries of the problem. In addition, most models produce only a single prediction at test time. Our core insight, which addresses these shortcomings, is that corresponding keypoints between images can be used to obtain the optimal transformation via a differentiable closed-form expression. We use this observation to drive the end-to-end learning of keypoints tailored for the registration task, without knowledge of ground-truth keypoints. This framework not only leads to substantially more robust registration but also yields better interpretability, since the keypoints reveal which parts of the image are driving the final alignment. Moreover, KeyMorph can be designed to be equivariant under image translations and/or symmetric with respect to the input image ordering. Finally, we show how multiple deformation fields, corresponding to different transformation variants, can be computed efficiently and in closed form at test time. We demonstrate the proposed framework in solving 3D affine and spline-based registration of multi-modal brain MRI scans. In particular, we show registration accuracy that surpasses current state-of-the-art methods, especially in the context of large displacements. Our code is available at https://github.com/evanmy/keymorph.
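The differentiable closed-form step described above can be illustrated in a few lines. Below is a minimal sketch, not the KeyMorph implementation itself, showing how an affine transform mapping one set of detected keypoints onto another can be obtained in closed form via least squares; the tensor names and shapes are illustrative assumptions.

```python
# Hedged sketch: closed-form affine solve from corresponding keypoints.
# Not the authors' code; tensor names and shapes are illustrative.
import torch

def closed_form_affine(p_moving: torch.Tensor, p_fixed: torch.Tensor) -> torch.Tensor:
    """Least-squares affine mapping moving keypoints onto fixed keypoints.

    p_moving, p_fixed: (N, 3) tensors of corresponding 3D keypoints.
    Returns a (3, 4) matrix A such that A @ [p; 1] approximates the fixed keypoint
    for each pair. Because the solve is differentiable, a keypoint detector can be
    trained end-to-end through it without ground-truth keypoints.
    """
    ones = torch.ones(p_moving.shape[0], 1, dtype=p_moving.dtype, device=p_moving.device)
    P = torch.cat([p_moving, ones], dim=1)      # (N, 4) homogeneous coordinates
    A_t = torch.linalg.pinv(P) @ p_fixed        # (4, 3) least-squares solution
    return A_t.T                                # (3, 4) affine matrix
```

Swapping the affine solve for other transformation families (e.g., spline-based) corresponds to the transformation variants mentioned above.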

Learning to Compare Longitudinal Images

Apr 16, 2023
Heejong Kim, Mert R. Sabuncu

Longitudinal studies, where a series of images from the same set of individuals are acquired at different time points, represent a popular technique for studying and characterizing temporal dynamics in biomedical applications. The classical approach for longitudinal comparison involves normalizing for nuisance variations, such as image orientation or contrast differences, via pre-processing. Statistical analysis is, in turn, conducted to detect changes of interest, either at the individual or population level. This classical approach can suffer from pre-processing issues and limitations of the statistical modeling. For example, normalizing for nuisance variation might be hard in settings with many idiosyncratic changes. In this paper, we present a simple machine learning-based approach that can alleviate these issues. In our approach, we train a deep learning model (called PaIRNet, for Pairwise Image Ranking Network) to compare pairs of longitudinal images, with or without supervision. In the self-supervised setup, for instance, the model is trained to temporally order the images, which requires learning to recognize time-irreversible changes. Our results from four datasets demonstrate that PaIRNet can be very effective in localizing and quantifying meaningful longitudinal changes while discounting nuisance variation. Our code is available at https://github.com/heejong-kim/learning-to-compare-longitudinal-images.git.
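As a rough illustration of the pairwise-ranking idea, the sketch below shows a shared encoder scoring two images of the same subject and being trained, self-supervised, to predict which was acquired later; the module and variable names are assumptions, not the released PaIRNet code.

```python
# Hedged sketch of self-supervised pairwise temporal ordering; illustrative only.
import torch
import torch.nn as nn

class PairwiseRankingNet(nn.Module):
    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder  # any network mapping an image to a scalar "progression" score

    def forward(self, img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
        # Positive logit means img_b is predicted to be the later time point.
        return self.encoder(img_b) - self.encoder(img_a)

# Self-supervised training signal: the known acquisition order of each image pair.
# logits = model(img_t0, img_t1)
# loss = nn.functional.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```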

* to be published in MIDL 2023 

UniverSeg: Universal Medical Image Segmentation

Apr 12, 2023
Victor Ion Butoi, Jose Javier Gonzalez Ortiz, Tianyu Ma, Mert R. Sabuncu, John Guttag, Adrian V. Dalca

While deep learning models have become the predominant method for medical image segmentation, they are typically not capable of generalizing to unseen segmentation tasks involving new anatomies, image modalities, or labels. Given a new segmentation task, researchers generally have to train or fine-tune models, which is time-consuming and poses a substantial barrier for clinical researchers, who often lack the resources and expertise to train neural networks. We present UniverSeg, a method for solving unseen medical segmentation tasks without additional training. Given a query image and an example set of image-label pairs that define a new segmentation task, UniverSeg employs a new Cross-Block mechanism to produce accurate segmentation maps without the need for additional training. To achieve generalization to new tasks, we have gathered and standardized a collection of 53 open-access medical segmentation datasets with over 22,000 scans, which we refer to as MegaMedical. We used this collection to train UniverSeg on a diverse set of anatomies and imaging modalities. We demonstrate that UniverSeg substantially outperforms several related methods on unseen tasks, and thoroughly analyze and draw insights about important aspects of the proposed system. The UniverSeg source code and model weights are freely available at https://universeg.csail.mit.edu
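To make the in-context usage concrete, here is a minimal inference sketch: a new task is specified entirely by a support set of image-label pairs, and no gradient updates are performed on the query. The call signature and tensor shapes are assumptions rather than the official API; see https://universeg.csail.mit.edu for the released code and weights.

```python
# Hedged sketch of in-context segmentation with a UniverSeg-style model.
# The forward signature and shapes are assumptions, not the official API.
import torch

def segment_in_context(model, query, support_images, support_labels, threshold=0.5):
    """query: (B, 1, H, W); support_images and support_labels: (B, S, 1, H, W).

    The support set defines the task; the query is segmented without any training.
    """
    model.eval()
    with torch.no_grad():
        logits = model(query, support_images, support_labels)
    return (torch.sigmoid(logits) > threshold).float()  # predicted binary mask
```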

* Victor and Jose Javier contributed equally to this work. Project Website: https://universeg.csail.mit.edu 

Neural Pre-Processing: A Learning Framework for End-to-end Brain MRI Pre-processing

Mar 21, 2023
Xinzi He, Alan Wang, Mert R. Sabuncu

Head MRI pre-processing involves converting raw images to an intensity-normalized, skull-stripped brain in a standard coordinate space. In this paper, we propose an end-to-end weakly supervised learning approach, called Neural Pre-processing (NPP), for solving all three sub-tasks simultaneously via a neural network, trained on a large dataset without individual sub-task supervision. Because the overall objective is highly under-constrained, we explicitly disentangle geometry-preserving intensity mapping (skull-stripping and intensity normalization) from spatial transformation (spatial normalization). Quantitative results show that our model outperforms state-of-the-art methods that tackle only a single sub-task. Our ablation experiments demonstrate the importance of the architecture design we chose for NPP. Furthermore, NPP affords the user the flexibility to control each of these tasks at inference time. The code and model are freely available at https://github.com/Novestars/Neural-Pre-processing.
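A rough sketch of the disentanglement described above: one branch predicts a voxel-wise, geometry-preserving intensity map (covering skull-stripping and intensity normalization), while a second branch predicts a spatial transformation to a standard space. This is an assumed simplification for illustration, not the NPP architecture itself.

```python
# Hedged sketch of disentangled intensity mapping + spatial transformation; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledPreproc(nn.Module):
    def __init__(self, intensity_net: nn.Module, transform_net: nn.Module):
        super().__init__()
        self.intensity_net = intensity_net  # predicts a voxel-wise multiplicative field
        self.transform_net = transform_net  # predicts a (B, 3, 4) affine to standard space

    def forward(self, raw: torch.Tensor) -> torch.Tensor:  # raw: (B, 1, D, H, W)
        field = self.intensity_net(raw)          # near zero outside the brain
        normalized = raw * field                 # skull-stripped, intensity-normalized image
        theta = self.transform_net(normalized)   # affine parameters
        grid = F.affine_grid(theta, raw.shape, align_corners=False)
        return F.grid_sample(normalized, grid, align_corners=False)
```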

A Simple Nadaraya-Watson Head can offer Explainable and Calibrated Classification

Dec 07, 2022
Alan Q. Wang, Mert R. Sabuncu

In this paper, we empirically analyze a simple, non-learnable, and nonparametric Nadaraya-Watson (NW) prediction head that can be used with any neural network architecture. In the NW head, the prediction is a weighted average of labels from a support set. The weights are computed from distances between the query feature and support features. This is in contrast to the dominant approach of using a learnable classification head (e.g., a fully-connected layer) on the features, which can be challenging to interpret and can yield poorly calibrated predictions. Our empirical results on an array of computer vision tasks demonstrate that the NW head can yield better calibration than its parametric counterpart, while maintaining comparable accuracy and incurring minimal computational overhead. To further increase inference-time efficiency, we propose a simple approach that involves running a clustering step on the training set to create a relatively small distilled support set. In addition to using the weights as a means of interpreting model predictions, we further present an easy-to-compute "support influence function," which quantifies the influence of a support element on the prediction for a given query. As we demonstrate in our experiments, the influence function allows the user to debug a trained model. We believe that the NW head is a flexible, interpretable, and highly useful building block that can be used in a range of applications.
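The NW head itself is small enough to state directly. The sketch below is a plausible rendering rather than the authors' code: class probabilities are computed as a softmax-weighted average of support-set labels based on query-support feature distances, and the same weights can be inspected to interpret a prediction.

```python
# Hedged sketch of a Nadaraya-Watson prediction head; names are illustrative.
import torch
import torch.nn.functional as F

def nw_head(query_feats, support_feats, support_labels, num_classes, temperature=1.0):
    """query_feats: (B, D); support_feats: (S, D); support_labels: (S,) integer class ids.

    Returns (B, num_classes) class probabilities with no learnable parameters in the head.
    """
    dists = torch.cdist(query_feats, support_feats)          # (B, S) Euclidean distances
    weights = F.softmax(-dists / temperature, dim=-1)        # closer support items get larger weight
    onehot = F.one_hot(support_labels, num_classes).float()  # (S, C)
    return weights @ onehot                                  # weighted average of support labels
```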

LARO: Learned Acquisition and Reconstruction Optimization to accelerate Quantitative Susceptibility Mapping

Nov 01, 2022
Jinwei Zhang, Pascal Spincemaille, Hang Zhang, Thanh D. Nguyen, Chao Li, Jiahao Li, Ilhami Kovanlikaya, Mert R. Sabuncu, Yi Wang

Quantitative susceptibility mapping (QSM) involves the acquisition and reconstruction of a series of images at multiple echo times to estimate the tissue field, which prolongs scan time and requires a specific reconstruction technique. In this paper, we present a new framework, called Learned Acquisition and Reconstruction Optimization (LARO), which aims to accelerate the multi-echo gradient echo (mGRE) pulse sequence for QSM. Our approach involves optimizing a Cartesian multi-echo k-space sampling pattern jointly with a deep reconstruction network. This optimized sampling pattern is then implemented in an mGRE sequence using Cartesian fan-beam k-space segmenting and ordering for prospective scans. Furthermore, we propose to insert a recurrent temporal feature fusion module into the reconstruction network to capture signal redundancies along the echo-time dimension. Our ablation studies show that both the optimized sampling pattern and the proposed reconstruction strategy help improve the quality of the multi-echo image reconstructions. Generalization experiments show that LARO is robust on test data with new pathologies and different sequence parameters. Our code is available at https://github.com/Jinwei1209/LARO.git.
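For intuition, the sketch below shows the retrospective undersampling step implied by a learned Cartesian sampling pattern: a binary mask selects k-space samples per echo, and the zero-filled inverse FFT is what a reconstruction network would take as input. This is an illustrative simplification, not the LARO pipeline; see https://github.com/Jinwei1209/LARO.git for the actual code.

```python
# Hedged sketch: apply a (learned) binary Cartesian mask to multi-echo k-space
# and form the zero-filled reconstruction used as network input. Illustrative only.
import torch

def undersample_multi_echo(kspace: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """kspace: (E, H, W) complex multi-echo k-space; mask: (E, H, W) binary sampling pattern."""
    masked = kspace * mask          # keep only the sampled k-space locations
    return torch.fft.ifft2(masked)  # zero-filled image per echo, fed to the reconstruction network
```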

GLACIAL: Granger and Learning-based Causality Analysis for Longitudinal Studies

Oct 13, 2022
Minh Nguyen, Gia H. Ngo, Mert R. Sabuncu

The Granger framework is widely used for discovering causal relationships based on time-varying signals. Implementations of Granger causality (GC) are mostly developed for densely sampled time-series data. A substantially different setting, particularly common in population health applications, is the longitudinal study design, where multiple individuals are followed and sparsely observed at a limited number of time points. Longitudinal studies commonly track many variables, which are likely governed by nonlinear dynamics that might have individual-specific idiosyncrasies and exhibit both direct and indirect causes. Furthermore, real-world longitudinal data often suffer from widespread missingness. GC methods are not well-suited to handle these issues. In this paper, we intend to fill this methodological gap. We propose to marry the GC framework with a machine learning-based prediction model. We call our approach GLACIAL, which stands for "Granger and LeArning-based CausalIty Analysis for Longitudinal studies." GLACIAL treats individuals as independent samples and uses average prediction accuracy on hold-out individuals to test for effects of causal relationships. GLACIAL employs a multi-task neural network trained with input feature dropout to efficiently learn nonlinear dynamic relationships between a large number of variables, handle missing values, and probe causal links. Extensive experiments on synthetic and real data demonstrate the utility of GLACIAL and show how it can outperform competitive baselines.
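The hold-out prediction test described above can be sketched as follows: a candidate link from a source variable to a target variable is scored by how much hold-out prediction error for the target increases when the source is masked from the input. The masking convention and function names are assumptions, not the GLACIAL implementation.

```python
# Hedged sketch of a Granger-style link test via hold-out prediction error; illustrative only.
import numpy as np

def granger_score(predict_fn, X_holdout, y_holdout, source_idx):
    """predict_fn: maps an input array to predictions of the target variable.
    X_holdout: (num_individuals, num_timepoints, num_variables); y_holdout: matching targets.

    A positive score (error rises when the source variable is masked) is evidence
    that the source carries predictive information about the target.
    """
    err_full = np.nanmean((predict_fn(X_holdout) - y_holdout) ** 2)
    X_masked = X_holdout.copy()
    X_masked[..., source_idx] = np.nan  # a model trained with input feature dropout can tolerate missing inputs
    err_masked = np.nanmean((predict_fn(X_masked) - y_holdout) ** 2)
    return err_masked - err_full
```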
