Xinran Wu

Monocular Semantic Scene Completion via Masked Recurrent Networks

Jul 23, 2025

Xuzhi Wang, Xinran Wu, Song Wang, Lingdong Kong, Ziping Zhao

Abstract: Monocular Semantic Scene Completion (MSSC) aims to predict the voxel-wise occupancy and semantic category from a single-view RGB image. Existing methods adopt a single-stage framework that tries to achieve visible region segmentation and occluded region hallucination simultaneously, and they are also affected by inaccurate depth estimation. Such methods often achieve suboptimal performance, especially in complex scenes. We propose a novel two-stage framework that decomposes MSSC into coarse MSSC followed by the Masked Recurrent Network. Specifically, we propose the Masked Sparse Gated Recurrent Unit (MS-GRU), which concentrates on the occupied regions through the proposed mask updating mechanism and uses a sparse GRU design to reduce the computation cost. Additionally, we propose the distance attention projection to reduce projection errors by assigning different attention scores according to the distance to the observed surface. Experimental results demonstrate that our proposed unified framework, MonoMRN, effectively supports both indoor and outdoor scenes and achieves state-of-the-art performance on the NYUv2 and SemanticKITTI datasets. Furthermore, we conduct robustness analysis under various disturbances, highlighting the role of the Masked Recurrent Network in enhancing the model's resilience to such challenges. The source code is publicly available.

* ICCV 2025; 15 pages, 10 figures, 6 tables; Code at https://github.com/alanWXZ/MonoMRN
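The abstract describes a GRU that concentrates on occupied regions via a mask. As a rough illustration only, the sketch below applies a standard GRU update to the voxels selected by a boolean mask and leaves all others untouched. It is not the MS-GRU from the paper: the per-voxel linear weights (no sparse convolutions), the function name `masked_gru_step`, and the fixed mask (no mask updating mechanism) are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def masked_gru_step(h, x, mask, params):
    """One standard GRU update, applied only where mask is True.

    h:    (N, C) hidden state, one row per voxel
    x:    (N, C_in) input features
    mask: (N,) boolean, True for voxels to update (hypothetical occupancy mask)
    params: (Wz, Uz, Wr, Ur, Wh, Uh) with W* of shape (C_in, C), U* of shape (C, C)
    """
    Wz, Uz, Wr, Ur, Wh, Uh = params
    idx = np.flatnonzero(mask)            # gather active voxels only
    h_a, x_a = h[idx], x[idx]
    z = sigmoid(x_a @ Wz + h_a @ Uz)      # update gate
    r = sigmoid(x_a @ Wr + h_a @ Ur)      # reset gate
    h_tilde = np.tanh(x_a @ Wh + (r * h_a) @ Uh)
    h_new = h.copy()
    h_new[idx] = (1.0 - z) * h_a + z * h_tilde  # scatter results back
    return h_new

# Usage with random data (illustrative shapes only)
rng = np.random.default_rng(0)
N, C_in, C = 6, 3, 4
params = tuple(rng.normal(size=s) for s in [(C_in, C), (C, C)] * 3)
h0 = rng.normal(size=(N, C))
x0 = rng.normal(size=(N, C_in))
occupied = np.array([True, False, True, True, False, False])
h1 = masked_gru_step(h0, x0, occupied, params)
```

Gathering the active rows before the matrix products is what keeps the cost proportional to the number of masked voxels rather than the full grid.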

An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation

Jun 18, 2024

Qin Li, Yizhe Zhang, Yan Li, Jun Lyu, Meng Liu, Longyu Sun, Mengting Sun, Qirong Li, Wenyue Mao, Xinran Wu(+4 more)

Abstract: The segmentation foundation model, e.g., Segment Anything Model (SAM), has attracted increasing interest in the medical image community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance from the perspectives of overall accuracy and efficiency, yet little attention was given to fairness considerations. This oversight raises questions about the potential for performance biases that could mirror those found in task-specific deep learning models like nnU-Net. In this paper, we explore the fairness dilemma concerning large segmentation foundation models. We prospectively curate a benchmark dataset of 3D MRI and CT scans of organs including the liver, kidney, spleen, lung and aorta from a total of 1056 healthy subjects with expert segmentations. Crucially, we document demographic details such as gender, age, and body mass index (BMI) for each subject to facilitate a nuanced fairness analysis. We test state-of-the-art foundation models for medical image segmentation, including the original SAM, medical SAM and SAT models, to evaluate segmentation efficacy across different demographic groups and identify disparities. Our comprehensive analysis, which accounts for various confounding factors, reveals significant fairness concerns within these foundation models. Moreover, our findings highlight not only disparities in overall segmentation metrics, such as the Dice Similarity Coefficient, but also significant variations in the spatial distribution of segmentation errors, offering empirical evidence of the nuanced challenges in ensuring fairness in medical image segmentation.

* Accepted to MICCAI-2024
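For readers unfamiliar with the metric named in the abstract, the Dice Similarity Coefficient between a predicted mask A and a reference mask B is 2|A∩B| / (|A| + |B|). The sketch below computes it and averages it per demographic group. The helper names `dice_coefficient` and `dice_by_group` and the group labels are hypothetical, and this simple per-group mean does not account for confounding factors the way the study says its analysis does.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2|A n B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(pred, target).sum() / denom)

def dice_by_group(preds, targets, groups):
    """Mean Dice per group label (e.g. a gender, age, or BMI bucket)."""
    scores = {}
    for p, t, g in zip(preds, targets, groups):
        scores.setdefault(g, []).append(dice_coefficient(p, t))
    return {g: float(np.mean(v)) for g, v in scores.items()}

# Toy example: two subjects in group "a", one in group "b"
preds   = [[[1, 1], [0, 0]], [[1, 0], [0, 0]], [[0, 0], [1, 1]]]
targets = [[[1, 0], [0, 0]], [[1, 0], [0, 0]], [[1, 1], [0, 0]]]
groups  = ["a", "a", "b"]
per_group = dice_by_group(preds, targets, groups)
```

A gap between the per-group means is only a first signal of disparity; it says nothing about where in the organ the errors occur.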

A Hierarchical Conditional Random Field-based Attention Mechanism Approach for Gastric Histopathology Image Classification

Feb 21, 2021

Yixin Li, Xinran Wu, Chen Li, Changhao Sun, Md Rahaman, Yudong Yao, Xiaoyan Li, Yong Zhang, Tao Jiang

Abstract: In Gastric Histopathology Image Classification (GHIC) tasks, which are usually weakly supervised, the images inevitably contain redundant information. Therefore, designing networks that can focus on effective distinguishing features has become a popular research topic. In this paper, to perform GHIC well and to assist pathologists in clinical diagnosis, an intelligent Hierarchical Conditional Random Field based Attention Mechanism (HCRF-AM) model is proposed. The HCRF-AM model consists of an Attention Mechanism (AM) module and an Image Classification (IC) module. In the AM module, an HCRF model is built to extract attention regions. In the IC module, a Convolutional Neural Network (CNN) model is trained on the selected attention regions, and then an algorithm called Classification Probability-based Ensemble Learning is applied to obtain the image-level results from the patch-level output of the CNN. In the experiment, a classification specificity of 96.67% is achieved on a gastric histopathology dataset of 700 images. Our HCRF-AM model demonstrates high classification performance and shows its effectiveness and future potential in the GHIC field.
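The abstract does not spell out how Classification Probability-based Ensemble Learning combines patch outputs, so the sketch below substitutes a plain assumption: average the per-patch class probabilities and take the argmax as the image label. It also shows how specificity, the metric reported above, is computed as TN / (TN + FP). The function names and the choice of label 0 as the negative class are hypothetical.

```python
import numpy as np

def image_level_prediction(patch_probs):
    """Combine patch-level class probabilities into one image-level label.

    patch_probs: (num_patches, num_classes) array of CNN softmax outputs.
    Assumption: simple mean over patches, not the paper's exact algorithm.
    """
    patch_probs = np.asarray(patch_probs, dtype=float)
    mean_probs = patch_probs.mean(axis=0)
    return int(np.argmax(mean_probs)), mean_probs

def specificity(y_true, y_pred, negative_label=0):
    """Specificity = TN / (TN + FP), i.e. the fraction of true negatives kept."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    negatives = y_true == negative_label
    if negatives.sum() == 0:
        raise ValueError("no negative samples in y_true")
    tn = np.sum(negatives & (y_pred == negative_label))
    return float(tn / negatives.sum())

# Toy example: three patches from one image, two classes
label, probs = image_level_prediction([[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]])
spec = specificity([0, 0, 0, 1], [0, 0, 1, 1])
```

Averaging probabilities rather than majority-voting hard labels lets confident patches outweigh uncertain ones, which is one common reason to ensemble at the probability level.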
