Abstract: Identifying and counting blood components such as red blood cells, various types of white blood cells, and platelets is a critical task for healthcare practitioners. Deep learning approaches, particularly convolutional neural networks (CNNs) using supervised learning strategies, have shown considerable success for such tasks. However, CNN-based architectures such as U-Net often struggle to accurately identify platelets due to their size and the high variability of their features. To address these challenges, researchers have commonly employed strategies such as class-weighted loss functions, which have demonstrated some success. However, these strategies do not address the more significant challenge of platelet variability in size and the tendency of platelets to form aggregates and associations with other blood components. In this study, we explored an alternative approach by investigating the role of convolutional kernels in mitigating these issues. We also assigned separate classes to singular platelets and platelet aggregates and performed semantic segmentation using various U-Net architectures for identifying platelets. We then evaluated and compared two common methods for counting platelets (the pixel area method and connected component analysis) and proposed an alternative approach specialized for single platelets and platelet aggregates. Our experiments showed significant improvements in the identification of platelets, highlighting the importance of optimizing convolutional operations and class designations. We show that the common practice of pixel area-based counting often overestimates platelet counts, whereas the proposed method offers significant improvements. We discuss in detail how these methods derive platelet counts from segmentation masks.
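To make the two counting methods named in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the pixel area method divides the total number of platelet-class pixels by a nominal per-platelet area, and that connected component analysis counts 4-connected blobs of the platelet class. The function names, the toy mask, and the nominal area of 4 pixels are all hypothetical and chosen only for illustration.

```python
from collections import deque


def count_by_pixel_area(mask, platelet_class, nominal_area):
    """Estimate the count as total class pixels / assumed area of one platelet."""
    total_pixels = sum(row.count(platelet_class) for row in mask)
    return round(total_pixels / nominal_area)


def count_by_connected_components(mask, platelet_class):
    """Count 4-connected blobs of platelet_class with a breadth-first flood fill."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != platelet_class or seen[r][c]:
                continue
            # Found an unvisited platelet pixel: it starts a new component.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and mask[ny][nx] == platelet_class):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Hypothetical toy segmentation mask: 0 = background, 1 = platelet.
# It contains three blobs of 4, 9, and 2 pixels (15 platelet pixels in total).
TOY_MASK = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
]

area_count = count_by_pixel_area(TOY_MASK, platelet_class=1, nominal_area=4)
blob_count = count_by_connected_components(TOY_MASK, platelet_class=1)
print(area_count, blob_count)  # 4 3
```

On this toy mask the area-based estimate is 4 while the mask contains 3 blobs, because the 9-pixel blob is larger than the assumed nominal area. The example only illustrates how size variability can inflate an area-based count; it says nothing about the magnitude of the effect reported in the study. In practice a library routine such as `scipy.ndimage.label` would typically replace the hand-written flood fill.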
Abstract: Detecting and classifying small blood components is a significant challenge in hematology analytics, in particular when objects exist as small, pixel-sized entities in a large context of similar objects. Deep learning approaches using supervised models with pre-trained weights, such as residual networks and vision transformers (ViTs), have demonstrated success in many applications. Unfortunately, when applied to images outside the domain of the learned representations, these methods often yield less than acceptable performance. One strategy to overcome this is to use self-supervised models, where representations are learned first and the resulting weights are then applied to downstream applications. Recently, masked autoencoders (MAEs) have proven effective for obtaining representations that capture global context information. By masking regions of an image and training the model to reconstruct both the masked and non-masked regions, one obtains weights that can be used for various applications. However, if the objects in an image are smaller than the mask, the global context information is lost, making it almost impossible to reconstruct the image. In this study, we investigated the effect of mask ratios and patch sizes for blood components using an MAE to obtain learned ViT encoder representations. We then applied the encoder weights to train a U-Net Transformer for semantic segmentation to obtain both local and global contextual information. Our experimental results demonstrate that both smaller mask ratios and smaller patch sizes improve the reconstruction of images using an MAE. We also show the results of semantic segmentation with and without pre-trained weights, where smaller-sized blood components benefited from pre-training. Overall, our proposed method offers an efficient and effective strategy for the segmentation and classification of small objects.
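The interaction between patch size, mask ratio, and object size described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes square images, MAE-style uniform random masking of non-overlapping square patches, and a hypothetical 6x6-pixel object. All function names and numeric values are chosen for illustration only.

```python
import random


def random_patch_mask(image_size, patch_size, mask_ratio, seed=0):
    """Return a square grid of booleans; True marks a masked (hidden) patch."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    grid = image_size // patch_size
    num_patches = grid * grid
    # Keep a fixed number of patches visible and mask the rest.
    num_keep = int(num_patches * (1 - mask_ratio))
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)
    visible = set(order[:num_keep])
    return [[(r * grid + c) not in visible for c in range(grid)]
            for r in range(grid)]


def patches_covering(bbox, patch_size):
    """Patch indices (row, col) touched by bbox = (top, left, bottom, right).

    bottom and right are exclusive pixel coordinates.
    """
    top, left, bottom, right = bbox
    return {(r, c)
            for r in range(top // patch_size, (bottom - 1) // patch_size + 1)
            for c in range(left // patch_size, (right - 1) // patch_size + 1)}


def visible_patch_fraction(bbox, mask, patch_size):
    """Fraction of the patches touched by the object that remain unmasked."""
    covering = patches_covering(bbox, patch_size)
    visible = sum(1 for r, c in covering if not mask[r][c])
    return visible / len(covering)


# Hypothetical small object: a 6x6-pixel region in a 224x224 image.
SMALL_OBJECT = (32, 32, 38, 38)

coarse = random_patch_mask(224, patch_size=16, mask_ratio=0.75)
fine = random_patch_mask(224, patch_size=4, mask_ratio=0.25)

# With 16-pixel patches the object sits inside a single patch, so it is
# either entirely hidden or entirely visible. With 4-pixel patches it
# spans four patches, so part of it can stay visible after masking.
print(len(patches_covering(SMALL_OBJECT, 16)), len(patches_covering(SMALL_OBJECT, 4)))
print(visible_patch_fraction(SMALL_OBJECT, coarse, 16))
print(visible_patch_fraction(SMALL_OBJECT, fine, 4))
```

When the object fits inside one patch, masking that patch removes every pixel of the object from the encoder's input, which matches the failure mode described above. Smaller patches and lower mask ratios raise the chance that at least part of each small object remains visible.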