Abstract:Coronary artery calcification (CAC) is a strong predictor of cardiovascular risk but remains underutilized in clinical routine thoracic imaging due to the need for dedicated imaging protocols and manual annotation. We present DeepCAC2, a publicly available dataset containing automated CAC segmentations, coronary artery calcium scores, and derived risk categories generated from low-dose chest CT scans of the National Lung Screening Trial (NLST). Using a fully automated deep learning pipeline trained on expert-annotated cardiac CT data, we processed 127,776 CT scans from 26,228 individuals and generated standardized CAC segmentations and risk estimates for each acquisition. We already provide a public dashboard as a simple tool to visually inspect a random subset of 200 NLST patients of the dataset. The dataset will be released with DICOM-compatible segmentation objects and structured metadata to support reproducible downstream analysis. The deep learning pipeline will be made publicly available as a DICOM-compatible MHub.ai container. DeepCAC2 provides a transparent, large-scale, public, fully reproducible resource for research in cardiovascular risk assessment, opportunistic screening, and imaging biomarker development.
Abstract:Chest radiographs (CXRs) are among the most common tests in medicine. Automated image interpretation may reduce radiologists\' workload and expand access to diagnostic expertise. Deep learning multi-task and foundation models have shown strong performance for CXR interpretation but are vulnerable to shortcut learning, where models rely on spurious and off-target correlations rather than clinically relevant features to make decisions. We introduce RoentMod, a counterfactual image editing framework that generates anatomically realistic CXRs with user-specified, synthetic pathology while preserving unrelated anatomical features of the original scan. RoentMod combines an open-source medical image generator (RoentGen) with an image-to-image modification model without requiring retraining. In reader studies with board-certified radiologists and radiology residents, RoentMod-produced images appeared realistic in 93\% of cases, correctly incorporated the specified finding in 89-99\% of cases, and preserved native anatomy comparable to real follow-up CXRs. Using RoentMod, we demonstrate that state-of-the-art multi-task and foundation models frequently exploit off-target pathology as shortcuts, limiting their specificity. Incorporating RoentMod-generated counterfactual images during training mitigated this vulnerability, improving model discrimination across multiple pathologies by 3-19\% AUC in internal validation and by 1-11\% for 5 out of 6 tested pathologies in external testing. These findings establish RoentMod as a broadly applicable tool for probing and correcting shortcut learning in medical AI. By enabling controlled counterfactual interventions, RoentMod enhances the robustness and interpretability of CXR interpretation models and provides a generalizable strategy for improving foundation models in medical imaging.