Abstract: Critical retained foreign objects (RFOs) on intraoperative chest radiographs are rare but high-risk events. Their scarcity limits robust automated detection model training and generalization. We introduce SurgRFO, a two-stage synthesis framework for generating realistic RFO-present intraoperative chest X-rays. In Stage 1, a Roentgen chest X-ray foundation model is fine-tuned on surgical-domain images to generate realistic RFO-free backgrounds that preserve anatomy, indwelling lines and tubes, and intraoperative imaging characteristics. In Stage 2, a lightweight generator trained on localized RFO patches from limited positive cases synthesizes diverse RFO instances, which are composited onto generated backgrounds using conditional Poisson fusion to improve photometric consistency. We evaluate SurgRFO through (i) a blinded clinician study assessing realism and clinical plausibility, and (ii) downstream detection experiments in which synthesized data are used to augment Faster R-CNN, YOLOv8, and RetinaNet. SurgRFO consistently improves sensitivity at low false-positive-per-image (FPPI) operating points on internal and external test sets. Clinician ratings indicate that the synthesized images achieve realism comparable to real intraoperative images. Ablation analyses further examine fusion strategies and synthesis scale. Ethical safeguards for synthetic surgical data are also discussed.
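
The abstract above reports sensitivity at low FPPI operating points. As a rough illustration of what such a metric computes, the following is a minimal sketch, not the authors' evaluation code. The function name `sensitivity_at_fppi` and its arguments are hypothetical, and it assumes each true-positive detection has already been matched to a distinct ground-truth RFO (no duplicate hits).

```python
import numpy as np


def sensitivity_at_fppi(scores, is_tp, num_lesions, num_images, target_fppi):
    """Best sensitivity reachable while keeping false positives per image
    at or below target_fppi.

    Assumption: is_tp[i] is True when detection i was matched to a distinct
    ground-truth object; unmatched detections count as false positives.
    """
    scores = np.asarray(scores, dtype=float)
    is_tp = np.asarray(is_tp, dtype=bool)

    # Sweep the score threshold from strictest to loosest.
    order = np.argsort(-scores, kind="stable")
    tp_cum = np.cumsum(is_tp[order])
    fp_cum = np.cumsum(~is_tp[order])

    fppi = fp_cum / num_images
    allowed = fppi <= target_fppi
    if not allowed.any():
        # Even the single most confident detection exceeds the FPPI budget.
        return 0.0
    return float(tp_cum[allowed].max() / num_lesions)


if __name__ == "__main__":
    # Toy example: 4 detections over 2 images containing 2 objects in total.
    scores = [0.9, 0.8, 0.7, 0.6]
    is_tp = [True, False, True, False]
    print(sensitivity_at_fppi(scores, is_tp, 2, 2, target_fppi=0.25))
    print(sensitivity_at_fppi(scores, is_tp, 2, 2, target_fppi=0.5))
```

Reporting this value at several small FPPI budgets gives the operating points that a free-response ROC analysis would typically summarize.
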
Abstract: Remote estimation of vital signs enables health monitoring in situations where contact-based devices are unavailable, too intrusive, or too expensive. In this paper, we present a modular, interpretable pipeline for pulse signal estimation from video of the face that achieves state-of-the-art results on publicly available datasets. Our imaging photoplethysmography (iPPG) system consists of three modules: face and landmark detection, time-series extraction, and pulse signal/pulse rate estimation. Unlike many deep learning methods that use a single black-box model to map directly from input video to output signal or heart rate, our modular approach enables each of the three parts of the pipeline to be interpreted individually. The pulse signal estimation module, which we call TURNIP (Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography), allows the system to faithfully reconstruct the underlying pulse signal waveform and use it to measure heart rate and pulse rate variability metrics, even in the presence of motion. When parts of the face are occluded due to extreme head poses, our system explicitly detects such "self-occluded" regions and maintains estimation robustness despite the missing information. Our algorithm provides reliable heart rate estimates without the need for specialized sensors or contact with the skin, outperforming previous iPPG methods on both color (RGB) and near-infrared (NIR) datasets.
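
The abstract above describes measuring heart rate from a reconstructed pulse waveform but does not say how the rate is read off. One common approach, shown here only as a sketch under that assumption, is to take the dominant spectral peak within a plausible heart-rate band. The function name `pulse_rate_bpm` and the 0.7 to 4.0 Hz band limits are illustrative choices, not details from the paper.

```python
import numpy as np


def pulse_rate_bpm(signal, fs, lo_hz=0.7, hi_hz=4.0):
    """Estimate pulse rate (beats per minute) from a pulse waveform.

    Assumption: the rate is the strongest frequency in [lo_hz, hi_hz];
    this band (42 to 240 bpm) is a hypothetical default.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC component
    window = np.hanning(len(x))           # reduce spectral leakage
    power = np.abs(np.fft.rfft(x * window)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)

    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


if __name__ == "__main__":
    fs = 30.0                              # typical video frame rate
    t = np.arange(600) / fs                # 20 seconds of samples
    synthetic = np.sin(2 * np.pi * 1.2 * t)  # 1.2 Hz, i.e. 72 bpm
    print(pulse_rate_bpm(synthetic, fs))
```

The frequency resolution is `fs / len(signal)`, so longer windows give finer rate estimates at the cost of slower response to rate changes.
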




Abstract: The classic metaphyseal lesion (CML) is a distinct injury that is highly specific for infant abuse. It commonly occurs in the distal tibia. To help radiologists detect these subtle fractures, we need to develop a model that can flag abnormal distal tibial radiographs (i.e., those with CMLs). Unfortunately, the development of such a model requires a large and diverse training database, which is often not available. To address this limitation, we propose a novel generative model for data augmentation. Unlike previous models that fail to generate data that span the diverse radiographic appearance of the distal tibial CML, our proposed masked conditional diffusion model (MaC-DM) not only generates realistic-appearing and wide-ranging synthetic distal tibial radiographs with and without CMLs, but also generates their associated segmentation labels. To achieve these tasks, MaC-DM combines the weighted segmentation masks of the tibias and the CML fracture sites as additional conditions for classifier guidance. The augmented images from our model improved the performance of ResNet-34 in classifying normal radiographs and those with CMLs. Further, the augmented images and their associated segmentation masks enhanced the performance of the U-Net in labeling areas of the CMLs on distal tibial radiographs.
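
The abstract above says MaC-DM combines weighted segmentation masks of the tibia and the CML fracture sites as conditions, without giving the weighting scheme. The sketch below shows one plausible way two binary masks could be merged into a single conditioning map. The function name `weighted_condition_mask`, the weights `w_tibia` and `w_cml`, and the rule that fracture pixels override tibia pixels are all assumptions for illustration, not the paper's formulation.

```python
import numpy as np


def weighted_condition_mask(tibia_mask, cml_mask, w_tibia=0.5, w_cml=1.0):
    """Merge two binary masks into one float conditioning map.

    Assumptions (hypothetical): background is 0, tibia pixels take w_tibia,
    and CML pixels take w_cml, overriding the tibia weight where they overlap.
    """
    tibia = np.asarray(tibia_mask, dtype=bool)
    cml = np.asarray(cml_mask, dtype=bool)
    if tibia.shape != cml.shape:
        raise ValueError("tibia_mask and cml_mask must have the same shape")

    cond = np.zeros(tibia.shape, dtype=np.float32)
    cond[tibia] = w_tibia
    cond[cml] = w_cml  # fracture sites get the larger weight
    return cond


if __name__ == "__main__":
    tibia = np.array([[1, 1], [0, 0]])
    cml = np.array([[0, 1], [0, 0]])
    print(weighted_condition_mask(tibia, cml))
```

A map like this could be concatenated with the noisy image as an extra input channel; whether MaC-DM does so, or injects the condition differently, is not stated in the abstract.
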