Abstract:Synthesizing high quality CT images remains a signifi-cant challenge due to the limited availability of annotat-ed data and the complex nature of CT imaging. In this work, we present PRO, a novel framework that, to the best of our knowledge, is the first to perform CT image synthesis in the projection domain using latent diffusion models. Unlike previous approaches that operate in the image domain, PRO learns rich structural representa-tions from raw projection data and leverages anatomi-cal text prompts for controllable synthesis. This projec-tion domain strategy enables more faithful modeling of underlying imaging physics and anatomical structures. Moreover, PRO functions as a foundation model, capa-ble of generalizing across diverse downstream tasks by adjusting its generative behavior via prompt inputs. Experimental results demonstrated that incorporating our synthesized data significantly improves perfor-mance across multiple downstream tasks, including low-dose and sparse-view reconstruction, even with limited training data. These findings underscore the versatility and scalability of PRO in data generation for various CT applications. These results highlight the potential of projection domain synthesis as a powerful tool for data augmentation and robust CT imaging. Our source code is publicly available at: https://github.com/yqx7150/PRO.
Abstract:Object recognition from images means to automatically find object(s) of interest and to return their category and location information. Benefiting from research on deep learning, like convolutional neural networks~(CNNs) and generative adversarial networks, the performance in this field has been improved significantly, especially when training and test data are drawn from similar distributions. However, mismatching distributions, i.e., domain shifts, lead to a significant performance drop. In this paper, we build domain-invariant detectors by learning domain classifiers via adversarial training. Based on the previous works that align image and instance level features, we mitigate the domain shift further by introducing a domain adaptation component at the region level within Faster \mbox{R-CNN}. We embed a domain classification network in the region proposal network~(RPN) using adversarial learning. The RPN can now generate accurate region proposals in different domains by effectively aligning the features between them. To mitigate the unstable convergence during the adversarial learning, we introduce a balanced domain classifier as well as a network learning rate adjustment strategy. We conduct comprehensive experiments using four standard datasets. The results demonstrate the effectiveness and robustness of our object detection approach in domain shift scenarios.