Zhuoyi Fang

Rich-U-Net: A medical image segmentation model for fusing spatial depth features and capturing minute structural details

Mar 31, 2026

Zhuoyi Fang, Kexuan Shi, Jiajia Liu, Qiang Han

Abstract: Medical image segmentation plays an important role in disease analysis. Deep neural networks can help doctors extract regions of interest from complex medical images, improving diagnostic accuracy and supporting better assessment of a patient's condition when formulating treatment plans. However, most current medical image segmentation methods struggle to extract spatial information accurately and to capture complex underlying structures and variations. In this article, we introduce the Rich-U-Net model, which integrates spatial and depth features. This fusion enhances the model's ability to detect fine structures and intricate details in complex medical images. Our multi-level, multi-dimensional feature fusion and optimization strategies enable fine-structure localization and accurate segmentation. Experiments on the ISIC2018, BUSI, GLAS, and CVC datasets show that Rich-U-Net surpasses other state-of-the-art models on the Dice, IoU, and HD95 metrics.
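For readers unfamiliar with the overlap metrics named in this abstract, the following is a minimal sketch of how Dice and IoU are commonly computed for a pair of binary masks. It is not taken from the paper: the function names `dice_score` and `iou_score`, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration. HD95 is a boundary-distance metric and is not covered here.

```python
from typing import Sequence, Tuple

# A 2D binary mask as nested sequences; 1 (or any truthy value) marks foreground.
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted foreground, target foreground) pixel counts."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0
    return inter, pred_sum, target_sum


def dice_score(pred: Mask, target: Mask) -> float:
    """Dice = 2|A n B| / (|A| + |B|)."""
    inter, pred_sum, target_sum = _counts(pred, target)
    if pred_sum + target_sum == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * inter / (pred_sum + target_sum)


def iou_score(pred: Mask, target: Mask) -> float:
    """IoU = |A n B| / |A u B|."""
    inter, pred_sum, target_sum = _counts(pred, target)
    union = pred_sum + target_sum - inter
    if union == 0:
        return 1.0  # assumption: same empty-mask convention as dice_score
    return inter / union


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(dice_score(pred, target), iou_score(pred, target))
```

For a single mask pair the two scores are linked by IoU = Dice / (2 - Dice), so they rank individual predictions identically; averages over a dataset can still differ in ranking.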

MD-RWKV-UNet: Scale-Aware Anatomical Encoding with Cross-Stage Fusion for Multi-Organ Segmentation

Mar 28, 2026

Zhuoyi Fang

Abstract: Multi-organ segmentation in medical imaging remains challenging due to large anatomical variability, complex inter-organ dependencies, and diverse organ scales and shapes. Conventional encoder-decoder architectures often struggle to capture both fine-grained local details and long-range context, which are crucial for accurate delineation, especially for small or deformable organs. To address these limitations, we propose MD-RWKV-UNet, a dynamic encoder network that enables scale-aware representation and spatially adaptive context modeling. At its core is the MD-RWKV block, a dual-path module that integrates deformable spatial shifts with the Receptance Weighted Key Value mechanism, allowing the receptive field to adapt dynamically to local structural cues. We further incorporate Selective Kernel Attention to enable adaptive selection of convolutional kernels with varying receptive fields, enhancing multi-scale interaction and improving robustness to organ size and shape variation. In parallel, a cross-stage dual-attention fusion strategy aggregates multi-level features across the encoder, preserving low-level structure while enhancing semantic consistency. Unlike methods that stack static convolutions or rely heavily on global attention, our approach provides a lightweight yet expressive solution for dynamic organ modeling. Experiments on Synapse and ACDC demonstrate state-of-the-art performance, particularly in boundary precision and small-organ segmentation.
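The idea of adaptively selecting among kernels with different receptive fields can be illustrated with a deliberately simplified, one-dimensional sketch. This is not the MD-RWKV-UNet implementation: the moving-average filters stand in for learned convolutions, and the function `selective_kernel_fuse` along with its `gate_weights` and `gate_biases` parameters are hypothetical placeholders for what would be learned layers in a real network. The sketch only shows the general pattern of running parallel branches, pooling a global descriptor, and using a softmax over branches to weight their outputs.

```python
import math
from typing import List, Sequence, Tuple


def box_filter(signal: Sequence[float], kernel_size: int) -> List[float]:
    """Same-length moving average; a stand-in for a convolution of this kernel size."""
    half = kernel_size // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selective_kernel_fuse(
    signal: Sequence[float],
    kernel_sizes: Sequence[int] = (3, 5),
    gate_weights: Sequence[float] = (1.0, -1.0),  # hypothetical "learned" gate
    gate_biases: Sequence[float] = (0.0, 0.0),    # hypothetical "learned" gate
) -> Tuple[List[float], List[float]]:
    """Return (fused output, per-branch attention weights)."""
    if not signal:
        raise ValueError("signal must be non-empty")
    # 1. Parallel branches with different receptive fields.
    branches = [box_filter(signal, k) for k in kernel_sizes]
    # 2. Sum the branches and pool them into one global descriptor.
    summed = [sum(vals) for vals in zip(*branches)]
    descriptor = sum(summed) / len(summed)
    # 3. One logit per branch, then softmax across branches.
    logits = [w * descriptor + b for w, b in zip(gate_weights, gate_biases)]
    attention = softmax(logits)
    # 4. Attention-weighted sum of the branch outputs.
    output = [sum(a * v for a, v in zip(attention, vals)) for vals in zip(*branches)]
    return output, attention


if __name__ == "__main__":
    fused, weights = selective_kernel_fuse([0.0, 0.0, 1.0, 0.0, 0.0])
    print(weights, fused)
```

Because the attention weights depend on the input through the pooled descriptor, the effective receptive field changes from input to input, which is the property the abstract attributes to selective kernels.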

DeepBayesFlow: A Bayesian Structured Variational Framework for Generalizable Prostate Segmentation via Expressive Posteriors and SDE-Girsanov Uncertainty Modeling

Mar 28, 2026

Zhuoyi Fang

Abstract: Automatic prostate MRI segmentation faces persistent challenges due to inter-patient anatomical variability, blurred tissue boundaries, and distribution shifts arising from diverse imaging protocols. To address these issues, we propose DeepBayesFlow, a novel Bayesian segmentation framework designed to enhance both robustness and generalization across clinical domains. DeepBayesFlow introduces three key innovations: a learnable NF-Posterior module based on normalizing flows that models complex, data-adaptive latent distributions; an NCVI inference mechanism that removes conjugacy constraints to enable flexible posterior learning in high-dimensional settings; and an SDE-Girsanov module that refines latent representations via time-continuous diffusion and formal measure transformation, injecting temporal coherence and physically grounded uncertainty into the inference process. Together, these components allow DeepBayesFlow to capture domain-invariant structural priors while dynamically adapting to domain-specific variations, achieving accurate and interpretable segmentation across heterogeneous prostate MRI datasets.
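As background for the two mathematical tools this abstract names, the textbook identities are sketched below. They are standard results, not formulas taken from the paper, and all symbols are assumptions introduced here: an input $x$, a base posterior $q_0$, invertible maps $f_k$, a drift $b$, a drift perturbation $u_t$, and unit diffusion for simplicity. The first pair is the change of variables behind a normalizing-flow posterior; the second is the Girsanov density relating two diffusions that differ only in drift.

```latex
% Normalizing-flow posterior: push a simple base sample through K invertible maps.
z_K = f_K \circ \cdots \circ f_1(z_0), \qquad z_0 \sim q_0(z_0 \mid x)

\log q_K(z_K \mid x) = \log q_0(z_0 \mid x)
  - \sum_{k=1}^{K} \log \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right|

% Girsanov change of measure (unit diffusion assumed).
% Under P:  dX_t = b(X_t, t)\,dt + dW_t
\frac{dQ}{dP} = \exp\!\left( \int_0^T u_t \cdot dW_t
  - \frac{1}{2} \int_0^T \lVert u_t \rVert^2 \, dt \right)
% Under Q:  dX_t = \bigl(b(X_t, t) + u_t\bigr)\,dt + dW_t^{Q},
% where W_t^{Q} = W_t - \int_0^t u_s\,ds is a Q-Brownian motion.
```

How DeepBayesFlow parameterizes these quantities is not stated in the abstract.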

When Mamba Meets xLSTM: An Efficient and Precise Method with the XLSTM-VMUNet Model for Skin lesion Segmentation

Nov 14, 2024

Zhuoyi Fang, KeXuan Shi, Qiang Han

Abstract: Automatic melanoma segmentation is essential for early skin cancer detection, yet challenges arise from the heterogeneity of melanoma, as well as interfering factors like blurred boundaries, low contrast, and imaging artifacts. While numerous algorithms have been developed to address these issues, previous approaches have often overlooked the need to jointly capture spatial and sequential features within dermatological images. This limitation hampers segmentation accuracy, especially in cases with indistinct borders or structurally similar lesions. Additionally, previous models lacked both a global receptive field and high computational efficiency. In this work, we present the XLSTM-VMUNet model, which jointly captures spatial and sequential features within dermatological images. XLSTM-VMUNet not only specializes in extracting spatial features from images, focusing on the structural characteristics of skin lesions, but also enhances contextual understanding, allowing more effective handling of complex medical image structures. Experimental results on the ISIC2018 dataset demonstrate that XLSTM-VMUNet outperforms VMUNet by 1.25% on DSC and 2.07% on IoU, with faster convergence and consistently high segmentation performance. The code for XLSTM-VMUNet is available at https://github.com/MrFang/xLSTM-VMUNet.
