Positron emission tomography (PET) combined with computed tomography (CT) imaging is routinely used in cancer diagnosis and prognosis by providing complementary information. Automatically segmenting tumors in PET/CT images can significantly improve examination efficiency. Traditional multi-modal segmentation solutions mainly rely on concatenation operations for modality fusion, which fail to effectively model the non-linear dependencies between PET and CT modalities. Recent studies have investigated various approaches to optimize the fusion of modality-specific features for enhancing joint representations. However, modality-specific encoders used in these methods operate independently, inadequately leveraging the synergistic relationships inherent in PET and CT modalities, for example, the complementarity between semantics and structure. To address these issues, we propose a Hierarchical Adaptive Interaction and Weighting Network termed H2ASeg to explore the intrinsic cross-modal correlations and transfer potential complementary information. Specifically, we design a Modality-Cooperative Spatial Attention (MCSA) module that performs intra- and inter-modal interactions globally and locally. Additionally, a Target-Aware Modality Weighting (TAMW) module is developed to highlight tumor-related features within multi-modal features, thereby refining tumor segmentation. By embedding these modules across different layers, H2ASeg can hierarchically model cross-modal correlations, enabling a nuanced understanding of both semantic and structural tumor features. Extensive experiments demonstrate the superiority of H2ASeg, outperforming state-of-the-art methods on AutoPet-II and Hecktor2022 benchmarks. The code is released at https://github.com/JinPLu/H2ASeg.
Federated Learning (FL) enables collaborative model training among medical centers without sharing private data. However, traditional FL risks on server failures and suboptimal performance on local data due to the nature of centralized model aggregation. To address these issues, we present Gossip Mutual Learning (GML), a decentralized framework that uses Gossip Protocol for direct peer-to-peer communication. In addition, GML encourages each site to optimize its local model through mutual learning to account for data variations among different sites. For the task of tumor segmentation using 146 cases from four clinical sites in BraTS 2021 dataset, we demonstrated GML outperformed local models and achieved similar performance as FedAvg with only 25% communication overhead.
Federated learning (FL) has emerged as a promising strategy for collaboratively training complicated machine learning models from different medical centers without the need of data sharing. However, the traditional FL relies on a central server to orchestrate the global model training among clients. This makes it vulnerable to the failure of the model server. Meanwhile, the model trained based on the global data property may not yield the best performance on the local data of a particular site due to the variations of data characteristics among them. To address these limitations, we proposed Gossip Mutual Learning(GML), a decentralized collaborative learning framework that employs Gossip Protocol for direct peer-to-peer communication and encourages each site to optimize its local model by leveraging useful information from peers through mutual learning. On the task of tumor segmentation on PET/CT images using HECKTOR21 dataset with 223 cases from five clinical sites, we demonstrated GML could improve tumor segmentation performance in terms of Dice Similarity Coefficient (DSC) by 3.2%, 4.6% and 10.4% on site-specific testing cases as compared to three baseline methods: pooled training, FedAvg and individual training, respectively. We also showed GML has comparable generalization performance as pooled training and FedAvg when applying them on 78 cases from two out-of-sample sites where no case was used for model training. In our experimental setup, GML showcased a sixfold decrease in communication overhead compared to FedAvg, requiring only 16.67% of the total communication overhead.