Brain tumor image segmentation is a challenging research topic in which deep-learning models have presented the best results. However, the traditional way of training those models from many pre-annotated images leaves several unanswered questions. Hence methodologies, such as Feature Learning from Image Markers (FLIM), have involved an expert in the learning loop to reduce human effort in data annotation and build models sufficiently deep for a given problem. FLIM has been successfully used to create encoders, estimating the filters of all convolutional layers from patches centered at marker voxels. In this work, we present Multi-Step (MS) FLIM - a user-assisted approach to estimating and selecting the most relevant filters from multiple FLIM executions. MS-FLIM is used only for the first convolutional layer, and the results already indicate improvement over FLIM. For evaluation, we build a simple U-shaped encoder-decoder network, named sU-Net, for glioblastoma segmentation using T1Gd and FLAIR MRI scans, varying the encoder's training method, using FLIM, MS-FLIM, and backpropagation algorithm. Also, we compared these sU-Nets with two State-Of-The-Art (SOTA) deep-learning models using two datasets. The results show that the sU-Net based on MS-FLIM outperforms the other training methods and achieves effectiveness within the standard deviations of the SOTA models.
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.