Jixiang Chen

Cross-view Generalized Diffusion Model for Sparse-view CT Reconstruction

Aug 14, 2025

Jixiang Chen, Yiqun Lin, Yi Qin, Hualiang Wang, Xiaomeng Li

Abstract:Sparse-view computed tomography (CT) reduces radiation exposure by subsampling projection views, but conventional reconstruction methods produce severe streak artifacts with undersampled data. While deep-learning-based methods enable single-step artifact suppression, they often produce over-smoothed results under significant sparsity. Though diffusion models improve reconstruction via iterative refinement and generative priors, they require hundreds of sampling steps and struggle with stability in highly sparse regimes. To tackle these concerns, we present the Cross-view Generalized Diffusion Model (CvG-Diff), which reformulates sparse-view CT reconstruction as a generalized diffusion process. Unlike existing diffusion approaches that rely on stochastic Gaussian degradation, CvG-Diff explicitly models image-domain artifacts caused by angular subsampling as a deterministic degradation operator, leveraging correlations across sparse-view CT at different sample rates. To address the inherent artifact propagation and inefficiency of sequential sampling in generalized diffusion models, we introduce two innovations: Error-Propagating Composite Training (EPCT), which facilitates identifying error-prone regions and suppresses propagated artifacts, and Semantic-Prioritized Dual-Phase Sampling (SPDPS), an adaptive strategy that prioritizes semantic correctness before detail refinement. Together, these innovations enable CvG-Diff to achieve high-quality reconstructions with minimal iterations, reaching 38.34 dB PSNR and 0.9518 SSIM for 18-view CT using only 10 steps on the AAPM-LDCT dataset. Extensive experiments demonstrate the superiority of CvG-Diff over state-of-the-art sparse-view CT reconstruction methods. The code is available at https://github.com/xmed-lab/CvG-Diff.

* MICCAI 2025 Spotlight
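
The angular subsampling that the abstract describes can be illustrated with a small sketch. This is not the CvG-Diff code: the function name `uniform_view_indices` and the full-scan size of 720 views are assumptions made for illustration, and only the 18-view setting comes from the abstract. The sketch picks evenly spaced projection angles and shows that, with this rule, a sparser view set can be nested inside a denser one, which is one simple way scans at different sample rates are related.

```python
from typing import List


def uniform_view_indices(num_full_views: int, num_sparse_views: int) -> List[int]:
    """Return evenly spaced projection-view indices out of a full scan.

    Both arguments are illustrative; the abstract does not state the
    full-scan view count.
    """
    if not 0 < num_sparse_views <= num_full_views:
        raise ValueError("need 0 < num_sparse_views <= num_full_views")
    return [(i * num_full_views) // num_sparse_views for i in range(num_sparse_views)]


# Hypothetical full scan of 720 views, subsampled at two sparsity levels.
FULL_VIEWS = 720
views_36 = uniform_view_indices(FULL_VIEWS, 36)
views_18 = uniform_view_indices(FULL_VIEWS, 18)

# With evenly divisible counts, the 18-view set is a subset of the 36-view set.
is_nested = set(views_18).issubset(views_36)
```

Selecting the corresponding rows of a sinogram and reconstructing from them would give the degraded image at each sparsity level; that reconstruction step is omitted here.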

DeepSparse: A Foundation Model for Sparse-View CBCT Reconstruction

May 05, 2025

Yiqun Lin, Hualiang Wang, Jixiang Chen, Jiewen Yang, Jiarong Guo, Xiaomeng Li

Abstract:Cone-beam computed tomography (CBCT) is a critical 3D imaging technology in the medical field, while the high radiation exposure required for high-quality imaging raises significant concerns, particularly for vulnerable populations. Sparse-view reconstruction reduces radiation by using fewer X-ray projections while maintaining image quality, yet existing methods face challenges such as high computational demands and poor generalizability to different datasets. To overcome these limitations, we propose DeepSparse, the first foundation model for sparse-view CBCT reconstruction, featuring DiCE (Dual-Dimensional Cross-Scale Embedding), a novel network that integrates multi-view 2D features and multi-scale 3D features. Additionally, we introduce the HyViP (Hybrid View Sampling Pretraining) framework, which pretrains the model on large datasets with both sparse-view and dense-view projections, and a two-step finetuning strategy to adapt and refine the model for new datasets. Extensive experiments and ablation studies demonstrate that our proposed DeepSparse achieves superior reconstruction quality compared to state-of-the-art methods, paving the way for safer and more efficient CBCT imaging.

Robust 6DoF Pose Tracking Considering Contour and Interior Correspondence Uncertainty for AR Assembly Guidance

Feb 17, 2025

Jixiang Chen, Jing Chen, Kai Liu, Haochen Chang, Shanfeng Fu, Jian Yang

Abstract:Augmented reality assembly guidance is essential for intelligent manufacturing and medical applications, requiring continuous measurement of the 6DoF poses of manipulated objects. Although current tracking methods have made significant advancements in accuracy and efficiency, they still face challenges in robustness when dealing with cluttered backgrounds, rotationally symmetric objects, and noisy sequences. In this paper, we first propose a robust contour-based pose tracking method that addresses error-prone contour correspondences and improves noise tolerance. It utilizes a fan-shaped search strategy to refine correspondences and models local contour shape and noise uncertainty as a mixed probability distribution, resulting in a highly robust contour energy function. Second, we introduce a CPU-only strategy to better track rotationally symmetric objects and assist the contour-based method in overcoming local minima by exploring sparse interior correspondences. This is achieved by pre-sampling interior points from sparse viewpoint templates offline and using the DIS optical flow algorithm to compute their correspondences during tracking. Finally, we formulate a unified energy function to fuse contour and interior information, which is solvable using a re-weighted least squares algorithm. Experiments on public datasets and real scenarios demonstrate that our method significantly outperforms state-of-the-art monocular tracking methods and can achieve more than 100 FPS using only a CPU.

* Submitted to IEEE Transactions on Instrumentation and Measurement
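
The abstract mentions solving the fused energy with a re-weighted least squares algorithm. The toy sketch below shows the general idea of iteratively re-weighted least squares on a much simpler problem, a robust 1D location estimate with Huber weights. It is an assumption-laden illustration, not the paper's pose solver: the function name `irls_location`, the Huber weighting, and the sample values are all made up for this example.

```python
from typing import List


def irls_location(values: List[float], delta: float = 1.0, iters: int = 20) -> float:
    """Robust location estimate via iteratively re-weighted least squares.

    Each iteration solves a weighted least-squares problem (here just a
    weighted mean) with Huber weights computed from the current residuals.
    """
    estimate = sum(values) / len(values)  # start from the ordinary mean
    for _ in range(iters):
        weights = []
        for v in values:
            residual = abs(v - estimate)
            # Huber weight: full weight for small residuals, down-weighted otherwise.
            weights.append(1.0 if residual <= delta else delta / residual)
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate


# Five measurements near 1.0 and one gross outlier (illustrative data).
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]
plain_mean = sum(samples) / len(samples)
robust_estimate = irls_location(samples)
```

The outlier pulls the plain mean far from the cluster, while the re-weighted estimate stays close to it. In a pose tracker the weighted mean would be replaced by a weighted Gauss-Newton style update over the six pose parameters.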

Spatial-Division Augmented Occupancy Field for Bone Shape Reconstruction from Biplanar X-Rays

Jul 22, 2024

Jixiang Chen, Yiqun Lin, Haoran Sun, Xiaomeng Li

Abstract:Retrieving 3D bone anatomy from biplanar X-ray images is crucial since it can significantly reduce radiation exposure compared to traditional CT-based methods. Although various deep learning models have been proposed to address this complex task, they suffer from two limitations: 1) They employ voxel representation for bone shape and exploit 3D convolutional layers to capture anatomy prior, which are memory-intensive and limit the reconstruction resolution. 2) They overlook the prevalent occlusion effect within X-ray images and directly extract features using a simple loss, which struggles to fully exploit complex X-ray information. To tackle these concerns, we present Spatial-division Augmented Occupancy Field (SdAOF). SdAOF adopts the continuous occupancy field for shape representation, reformulating the reconstruction problem as a per-point occupancy value prediction task. Its implicit and continuous nature enables memory-efficient training and fine-scale surface reconstruction at different resolutions during inference. Moreover, we propose a novel spatial-division augmented distillation strategy to provide feature-level guidance for capturing the occlusion relationship. Extensive experiments on the pelvis reconstruction dataset show that SdAOF outperforms state-of-the-art methods and reconstructs fine-scale bone surfaces. The code is available at https://github.com/xmed-lab/SdAOF

* Accepted to MICCAI 2024. Project link: https://github.com/xmed-lab/SdAOF
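
To make the idea of a continuous occupancy field concrete, here is a minimal sketch in which an analytic sphere stands in for the learned network. Everything in it is an assumption for illustration (the names `sphere_occupancy`, `sample_grid`, and `occupied_fraction`, and the sphere itself); it is not the SdAOF model. The point it shows is that a per-point occupancy function can be queried on a grid of any resolution at inference time, without the shape being stored as a fixed voxel volume.

```python
import math
from typing import Callable, List

Occupancy = Callable[[float, float, float], float]


def sphere_occupancy(x: float, y: float, z: float, radius: float = 0.5) -> float:
    """Stand-in for a learned occupancy field: 1.0 inside the sphere, else 0.0."""
    return 1.0 if x * x + y * y + z * z <= radius * radius else 0.0


def sample_grid(occupancy: Occupancy, resolution: int) -> List[List[List[float]]]:
    """Query the field at the cell centers of a resolution^3 grid over [-1, 1]^3."""
    step = 2.0 / resolution
    coords = [-1.0 + (i + 0.5) * step for i in range(resolution)]
    return [[[occupancy(x, y, z) for z in coords] for y in coords] for x in coords]


def occupied_fraction(grid: List[List[List[float]]]) -> float:
    """Fraction of grid cells marked as occupied."""
    cells = [v for plane in grid for row in plane for v in row]
    return sum(cells) / len(cells)


# The same field sampled at two resolutions chosen only at query time.
coarse = sample_grid(sphere_occupancy, 16)
fine = sample_grid(sphere_occupancy, 32)

# Exact volume ratio of the sphere to the [-1, 1]^3 cube, for comparison.
exact_fraction = (4.0 / 3.0) * math.pi * 0.5 ** 3 / 8.0
```

A surface mesh would typically be extracted from such a grid with an isosurface method such as marching cubes; that step is left out of this sketch.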

Learning Unlabeled Clients Divergence via Anchor Model Aggregation for Federated Semi-supervised Learning

Jul 14, 2024

Marawan Elbatel, Hualiang Wang, Jixiang Chen, Hao Wang, Xiaomeng Li

Abstract:Federated semi-supervised learning (FedSemi) refers to scenarios where there may be clients with fully labeled data, clients with partially labeled data, and even fully unlabeled clients while preserving data privacy. However, challenges arise from client drift due to undefined heterogeneous class distributions and erroneous pseudo-labels. Existing FedSemi methods typically fail to aggregate models from unlabeled clients due to their inherent unreliability, thus overlooking unique information from their heterogeneous data distribution, leading to sub-optimal results. In this paper, we enable unlabeled client aggregation through SemiAnAgg, a novel Semi-supervised Anchor-Based federated Aggregation. SemiAnAgg learns unlabeled client contributions via an anchor model, effectively harnessing their informative value. Our key idea is that by feeding local client data to the same global model and the same consistently initialized anchor model (i.e., random model), we can measure the importance of each unlabeled client accordingly. Extensive experiments demonstrate that SemiAnAgg achieves new state-of-the-art results on four widely used FedSemi benchmarks, leading to substantial performance improvements: a 9% increase in accuracy on CIFAR-100 and a 7.6% improvement in recall on the medical dataset ISIC-18, compared with prior state-of-the-art. Code is available at: https://github.com/xmed-lab/SemiAnAgg.

Learning 3D Gaussians for Extremely Sparse-View Cone-Beam CT Reconstruction

Jul 01, 2024

Yiqun Lin, Hualiang Wang, Jixiang Chen, Xiaomeng Li

Abstract:Cone-Beam Computed Tomography (CBCT) is an indispensable technique in medical imaging, yet the associated radiation exposure raises concerns in clinical practice. To mitigate these risks, sparse-view reconstruction has emerged as an essential research direction, aiming to reduce the radiation dose by utilizing fewer projections for CT reconstruction. Although implicit neural representations have been introduced for sparse-view CBCT reconstruction, existing methods primarily focus on local 2D features queried from sparse projections, which is insufficient to process the more complicated anatomical structures, such as the chest. To this end, we propose a novel reconstruction framework, namely DIF-Gaussian, which leverages 3D Gaussians to represent the feature distribution in the 3D space, offering additional 3D spatial information to facilitate the estimation of attenuation coefficients. Furthermore, we incorporate test-time optimization during inference to further improve the generalization capability of the model. We evaluate DIF-Gaussian on two public datasets, showing significantly better reconstruction performance than previous state-of-the-art methods.

* Accepted to MICCAI 2024. Project link: https://github.com/xmed-lab/DIF-Gaussian

Wavelet-Decoupling Contrastive Enhancement Network for Fine-Grained Skeleton-Based Action Recognition

Feb 03, 2024

Haochen Chang, Jing Chen, Yilin Li, Jixiang Chen, Xiaofeng Zhang

Abstract:Skeleton-based action recognition has attracted much attention, benefiting from its succinctness and robustness. However, the minimal inter-class variation in similar action sequences often leads to confusion. The inherent spatiotemporal coupling characteristics make it challenging to mine the subtle differences in joint motion trajectories, which is critical for distinguishing confusing fine-grained actions. To alleviate this problem, we propose a Wavelet-Attention Decoupling (WAD) module that utilizes discrete wavelet transform to effectively disentangle salient and subtle motion features in the time-frequency domain. Then, the decoupling attention adaptively recalibrates their temporal responses. To further amplify the discrepancies in these subtle motion features, we propose a Fine-grained Contrastive Enhancement (FCE) module to enhance attention towards trajectory features by contrastive learning. Extensive experiments are conducted on the coarse-grained dataset NTU RGB+D and the fine-grained dataset FineGYM. Our methods perform competitively compared to state-of-the-art methods and can discriminate confusing fine-grained actions well.

* Accepted by ICASSP 2024
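
The abstract relies on the discrete wavelet transform to separate salient from subtle motion. As a rough illustration of that building block only, the sketch below applies a single-level Haar transform to a 1D joint trajectory. The choice of the Haar wavelet, the function names `haar_dwt` and `haar_idwt`, and the sample trajectory are assumptions for this example; the abstract does not say which wavelet the WAD module uses.

```python
import math
from typing import List, Tuple


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx: List[float], detail: List[float]) -> List[float]:
    """Inverse of haar_dwt: rebuilds the original signal from both bands."""
    s = math.sqrt(2.0)
    signal: List[float] = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


# Illustrative 1D coordinate of one joint over eight frames.
trajectory = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 1.3]
low_band, high_band = haar_dwt(trajectory)   # coarse motion vs. frame-to-frame change
reconstructed = haar_idwt(low_band, high_band)
```

The low band follows the coarse movement of the joint and the high band captures small frame-to-frame changes, so the two can be processed or re-weighted separately and then recombined without loss.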

Dynamic Multi-objective Ensemble of Acquisition Functions in Batch Bayesian Optimization

Jun 22, 2022

Jixiang Chen, Fu Luo, Zhenkun Wang

Abstract:Bayesian optimization (BO) is a typical approach to solve expensive optimization problems. In each iteration of BO, a Gaussian process (GP) model is trained using the previously evaluated solutions; then the next candidate solutions for expensive evaluation are recommended by maximizing a cheaply-evaluated acquisition function on the trained surrogate model. The acquisition function plays a crucial role in the optimization process. However, each acquisition function has its own strengths and weaknesses, and no single acquisition function can consistently outperform the others on all kinds of problems. To better leverage the advantages of different acquisition functions, we propose a new method for batch BO. In each iteration, three acquisition functions are dynamically selected from a set based on their current and historical performance to form a multi-objective optimization problem (MOP). Using an evolutionary multi-objective algorithm to optimize such an MOP, a set of non-dominated solutions can be obtained. To select batch candidate solutions, we rank these non-dominated solutions into several layers according to their relative performance on the three acquisition functions. The empirical results show that the proposed method is competitive with the state-of-the-art methods on different problems.

* 4 pages, GECCO 2022
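
The abstract treats three acquisition functions as objectives and ranks candidates into layers. The sketch below uses standard non-dominated sorting to show what layering by dominance looks like. This is a swapped-in textbook technique and may differ from the ranking rule in the paper; the function names and the acquisition scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one.

    All objectives are treated as maximized, as acquisition values usually are.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_layers(scores: List[Sequence[float]]) -> List[List[int]]:
    """Peel off successive non-dominated fronts; returns candidate indices per layer."""
    remaining = set(range(len(scores)))
    layers: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        layers.append(front)
        remaining -= set(front)
    return layers


# Hypothetical values of three acquisition functions for four candidates.
acquisition_scores = [
    (0.9, 0.2, 0.5),
    (0.4, 0.8, 0.6),
    (0.3, 0.1, 0.4),
    (0.2, 0.1, 0.1),
]
layers = non_dominated_layers(acquisition_scores)
```

A batch could then be filled by taking candidates from the first layer and moving to later layers only if more points are needed.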
