Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Huiyu Zhou

Physically-Grounded Manifold Projection Model for Generalizable Metal Artifact Reduction in Dental CBCT

Jan 01, 2026

Zhi Li, Yaqi Wang, Bingtao Ma, Yifan Zhang, Huiyu Zhou, Shuai Wang

Abstract:Metal artifacts in Dental CBCT severely obscure anatomical structures, hindering diagnosis. Current deep learning for Metal Artifact Reduction (MAR) faces limitations: supervised methods suffer from spectral blurring due to "regression-to-the-mean", while unsupervised ones risk structural hallucinations. Denoising Diffusion Models (DDPMs) offer realism but rely on slow, stochastic iterative sampling, unsuitable for clinical use. To resolve this, we propose the Physically-Grounded Manifold Projection (PGMP) framework. First, our Anatomically-Adaptive Physics Simulation (AAPS) pipeline synthesizes high-fidelity training pairs via Monte Carlo spectral modeling and patient-specific digital twins, bridging the synthetic-to-real gap. Second, our DMP-Former adapts the Direct x-Prediction paradigm, reformulating restoration as a deterministic manifold projection to recover clean anatomy in a single forward pass, eliminating stochastic sampling. Finally, a Semantic-Structural Alignment (SSA) module anchors the solution using priors from medical foundation models (MedDINOv3), ensuring clinical plausibility. Experiments on synthetic and multi-center clinical datasets show PGMP outperforms state-of-the-art methods on unseen anatomy, setting new benchmarks in efficiency and diagnostic reliability. Code and data: https://github.com/ricoleehduu/PGMP.

* This manuscript has been submitted to Medical Image Analysis for peer review

Via

Access Paper or Ask Questions

Physically-Grounded Manifold Projection with Foundation Priors for Metal Artifact Reduction in Dental CBCT

Dec 30, 2025

Zhi Li, Yaqi Wang, Bingtao Ma, Yifan Zhang, Huiyu Zhou, Shuai Wang

* This manuscript has been submitted to Medical Image Analysis for peer review

Via

Access Paper or Ask Questions

PFed-Signal: An ADR Prediction Model based on Federated Learning

Dec 29, 2025

Tao Li, Peilin Li, Kui Lu, Yilei Wang, Junliang Shang, Guangshun Li, Huiyu Zhou

Abstract:The adverse drug reactions (ADRs) predicted based on the biased records in FAERS (U.S. Food and Drug Administration Adverse Event Reporting System) may mislead diagnosis online. Generally, such problems are solved by optimizing reporting odds ratio (ROR) or proportional reporting ratio (PRR). However, these methods that rely on statistical methods cannot eliminate the biased data, leading to inaccurate signal prediction. In this paper, we propose PFed-signal, a federated learning-based signal prediction model of ADR, which utilizes the Euclidean distance to eliminate the biased data from FAERS, thereby improving the accuracy of ADR prediction. Specifically, we first propose Pfed-Split, a method to split the original dataset into a split dataset based on ADR. Then we propose ADR-signal, an ADR prediction model, including a biased data identification method based on federated learning and an ADR prediction model based on Transformer. The former identifies the biased data according to the Euclidean distance and generates a clean dataset by deleting the biased data. The latter is an ADR prediction model based on Transformer trained on the clean data set. The results show that the ROR and PRR on the clean dataset are better than those of the traditional methods. Furthermore, the accuracy rate, F1 score, recall rate and AUC of PFed-Signal are 0.887, 0.890, 0.913 and 0.957 respectively, which are higher than the baselines.

* IEEE International Conference on Bioinformatics and Biomedicine

Via

Access Paper or Ask Questions

Multi-Label Transfer Learning in Non-Stationary Data Streams

Sep 09, 2025

Honghui Du, Leandro Minku, Aonghus Lawlor, Huiyu Zhou

Abstract:Label concepts in multi-label data streams often experience drift in non-stationary environments, either independently or in relation to other labels. Transferring knowledge between related labels can accelerate adaptation, yet research on multi-label transfer learning for data streams remains limited. To address this, we propose two novel transfer learning methods: BR-MARLENE leverages knowledge from different labels in both source and target streams for multi-label classification; BRPW-MARLENE builds on this by explicitly modelling and transferring pairwise label dependencies to enhance learning performance. Comprehensive experiments show that both methods outperform state-of-the-art multi-label stream approaches in non-stationary environments, demonstrating the effectiveness of inter-label knowledge transfer for improved predictive performance.

* Accepted at IEEE International Conference on Data Mining (ICDM) 2025

Via

Access Paper or Ask Questions

MARLINE: Multi-Source Mapping Transfer Learning for Non-Stationary Environments

Sep 09, 2025

Honghui Du, Leandro Minku, Huiyu Zhou

Abstract:Concept drift is a major problem in online learning due to its impact on the predictive performance of data stream mining systems. Recent studies have started exploring data streams from different sources as a strategy to tackle concept drift in a given target domain. These approaches make the assumption that at least one of the source models represents a concept similar to the target concept, which may not hold in many real-world scenarios. In this paper, we propose a novel approach called Multi-source mApping with tRansfer LearnIng for Non-stationary Environments (MARLINE). MARLINE can benefit from knowledge from multiple data sources in non-stationary environments even when source and target concepts do not match. This is achieved by projecting the target concept to the space of each source concept, enabling multiple source sub-classifiers to contribute towards the prediction of the target concept as part of an ensemble. Experiments on several synthetic and real-world datasets show that MARLINE was more accurate than several state-of-the-art data stream learning approaches.

* Published in the 2020 IEEE International Conference on Data Mining (ICDM)

Via

Access Paper or Ask Questions

SliceSemOcc: Vertical Slice Based Multimodal 3D Semantic Occupancy Representation

Sep 04, 2025

Han Huang, Han Sun, Ningzhong Liu, Huiyu Zhou, Jiaquan Shen

Abstract:Driven by autonomous driving's demands for precise 3D perception, 3D semantic occupancy prediction has become a pivotal research topic. Unlike bird's-eye-view (BEV) methods, which restrict scene representation to a 2D plane, occupancy prediction leverages a complete 3D voxel grid to model spatial structures in all dimensions, thereby capturing semantic variations along the vertical axis. However, most existing approaches overlook height-axis information when processing voxel features. And conventional SENet-style channel attention assigns uniform weight across all height layers, limiting their ability to emphasize features at different heights. To address these limitations, we propose SliceSemOcc, a novel vertical slice based multimodal framework for 3D semantic occupancy representation. Specifically, we extract voxel features along the height-axis using both global and local vertical slices. Then, a global local fusion module adaptively reconciles fine-grained spatial details with holistic contextual information. Furthermore, we propose the SEAttention3D module, which preserves height-wise resolution through average pooling and assigns dynamic channel attention weights to each height layer. Extensive experiments on nuScenes-SurroundOcc and nuScenes-OpenOccupancy datasets verify that our method significantly enhances mean IoU, achieving especially pronounced gains on most small-object categories. Detailed ablation studies further validate the effectiveness of the proposed SliceSemOcc framework.

* 14 pages, accepted by PRCV2025

Via

Access Paper or Ask Questions

SLENet: A Guidance-Enhanced Network for Underwater Camouflaged Object Detection

Sep 04, 2025

Xinxin Wang, Han Sun, Ningzhong Liu, Huiyu Zhou, Yinan Yao

Abstract:Underwater Camouflaged Object Detection (UCOD) aims to identify objects that blend seamlessly into underwater environments. This task is critically important to marine ecology. However, it remains largely underexplored and accurate identification is severely hindered by optical distortions, water turbidity, and the complex traits of marine organisms. To address these challenges, we introduce the UCOD task and present DeepCamo, a benchmark dataset designed for this domain. We also propose Semantic Localization and Enhancement Network (SLENet), a novel framework for UCOD. We first benchmark state-of-the-art COD models on DeepCamo to reveal key issues, upon which SLENet is built. In particular, we incorporate Gamma-Asymmetric Enhancement (GAE) module and a Localization Guidance Branch (LGB) to enhance multi-scale feature representation while generating a location map enriched with global semantic information. This map guides the Multi-Scale Supervised Decoder (MSSD) to produce more accurate predictions. Experiments on our DeepCamo dataset and three benchmark COD datasets confirm SLENet's superior performance over SOTA methods, and underscore its high generality for the broader COD task.

* 14pages, accepted by PRCV2025

Via

Access Paper or Ask Questions

Power allocation for cell-free MIMO integrated sensing and communication

May 26, 2025

Guoqing Xia, Pei Xiao, Qu Luo, Bing Ji, Yue Zhang, Huiyu Zhou

Figure 1 for Power allocation for cell-free MIMO integrated sensing and communication

Figure 2 for Power allocation for cell-free MIMO integrated sensing and communication

Figure 3 for Power allocation for cell-free MIMO integrated sensing and communication

Figure 4 for Power allocation for cell-free MIMO integrated sensing and communication

Abstract:In this paper, we investigate integrated sensing and communication (ISAC) in a cell-free (CF) multiple-input multiple-output (MIMO) network with single-antenna access points (APs), where each AP functions either as a transmitter for both sensing and communication or as a receiver for target-reflected signals. We derive closed-form Cramer-Rao lower bounds (CRLBs) for location and velocity estimation under arbitrary power allocation ratios, assuming the radar cross-section (RCS) is deterministic and unknown over the observation interval. A power allocation optimization problem is formulated to maximize the communication signal-to-interference-plus-noise ratio (SINR), subject to CRLB-based sensing constraints and per-transmitter power limits. To solve the resulting nonlinear and non-convex problem, we propose a penalty function and projection-based modified conjugate gradient algorithm with inexact line search (PP-MCG-ILS), and an alternative method based on a modified steepest descent approach (PP-MSD-ILS). Additionally, for power minimization in pure sensing scenarios, we introduce a penalty function-based normalized conjugate gradient algorithm (P-NCG-ILS). We analyze the convergence behavior and qualitatively compare the computational complexity of the proposed algorithms. Simulation results confirm the accuracy of the derived CRLBs and demonstrate the effectiveness of the proposed power allocation strategies in enhancing both sensing and overall ISAC performance.

* 14 pages, 11 figures

Via

Access Paper or Ask Questions

BOTM: Echocardiography Segmentation via Bi-directional Optimal Token Matching

May 23, 2025

Zhihua Liu, Lei Tong, Xilin He, Che Liu, Rossella Arcucci, Chen Jin, Huiyu Zhou

Abstract:Existed echocardiography segmentation methods often suffer from anatomical inconsistency challenge caused by shape variation, partial observation and region ambiguity with similar intensity across 2D echocardiographic sequences, resulting in false positive segmentation with anatomical defeated structures in challenging low signal-to-noise ratio conditions. To provide a strong anatomical guarantee across different echocardiographic frames, we propose a novel segmentation framework named BOTM (Bi-directional Optimal Token Matching) that performs echocardiography segmentation and optimal anatomy transportation simultaneously. Given paired echocardiographic images, BOTM learns to match two sets of discrete image tokens by finding optimal correspondences from a novel anatomical transportation perspective. We further extend the token matching into a bi-directional cross-transport attention proxy to regulate the preserved anatomical consistency within the cardiac cyclic deformation in temporal domain. Extensive experimental results show that BOTM can generate stable and accurate segmentation outcomes (e.g. -1.917 HD on CAMUS2H LV, +1.9% Dice on TED), and provide a better matching interpretation with anatomical consistency guarantee.

Via

Access Paper or Ask Questions

Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation

May 23, 2025

Zhihua Liu, Amrutha Saseendran, Lei Tong, Xilin He, Fariba Yousefi, Nikolay Burlutskiy, Dino Oglic, Tom Diethe, Philip Teare, Huiyu Zhou(+1 more)

Abstract:Open-set image segmentation poses a significant challenge because existing methods often demand extensive training or fine-tuning and generally struggle to segment unified objects consistently across diverse text reference expressions. Motivated by this, we propose Segment Anyword, a novel training-free visual concept prompt learning approach for open-set language grounded segmentation that relies on token-level cross-attention maps from a frozen diffusion model to produce segmentation surrogates or mask prompts, which are then refined into targeted object masks. Initial prompts typically lack coherence and consistency as the complexity of the image-text increases, resulting in suboptimal mask fragments. To tackle this issue, we further introduce a novel linguistic-guided visual prompt regularization that binds and clusters visual prompts based on sentence dependency and syntactic structural information, enabling the extraction of robust, noise-tolerant mask prompts, and significant improvements in segmentation accuracy. The proposed approach is effective, generalizes across different open-set segmentation tasks, and achieves state-of-the-art results of 52.5 (+6.8 relative) mIoU on Pascal Context 59, 67.73 (+25.73 relative) cIoU on gRefCOCO, and 67.4 (+1.1 relative to fine-tuned methods) mIoU on GranDf, which is the most complex open-set grounded segmentation task in the field.

Via

Access Paper or Ask Questions