Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zikang Xu

PDEAgent-Bench: A Multi-Metric, Multi-Library Benchmark for PDE Solver Generation

May 10, 2026

Zhen Hang, Yushan Yashengjiang, Junhui Li, Huanshuo Dong, Yang Wei, Zhezheng Hao, Jiangtao Ma, Songlin Bai, Haozhong Kai, Xihang Yue(+14 more)

Abstract:PDE-to-solver code generation aims to automatically synthesize executable numerical solvers from partial differential equation (PDE) specifications. This task requires not only understanding the mathematical structure of PDEs, but also selecting appropriate discretization schemes and solver configurations, and correctly implementing the resulting formulations in finite-element method (FEM) libraries. Existing code generation benchmarks mainly evaluate syntactic correctness, or success on predefined test cases. To our knowledge, there is currently no publicly available benchmark specifically for PDE-to-solver code generation, and general-purpose code benchmarks do not fully capture the unique challenges of numerical PDE solution, such as ensuring solver accuracy, efficiency, and compatibility with professional FEM libraries. We introduce PDEAgent-Bench, to the best of our knowledge, the first multi-metric, multi-library benchmark for PDE-to-solver code generation. PDEAgent-Bench contains 645 instances across 6 mathematical categories and 11 PDE families, with common FEM libraries for DOLFINx, Firedrake, and deal.II. Each instance provides an agent-facing problem specification, a reference solution on a prescribed evaluation grid, and case-specific accuracy and runtime targets. PDEAgent-Bench adopts a staged evaluation framework in which generated solvers must sequentially pass executability, numerical accuracy, and computational efficiency checks. Experiments with representative LLMs and code agents show that models can often produce runnable code, but their pass rate drops substantially once accuracy and efficiency requirements are enforced. These results indicate that current agents remain limited in producing numerically reliable and efficient PDE solvers, and that PDEAgent-Bench provides a reproducible testbed grounded in the practical requirements of numerical PDE solving.

Via

Access Paper or Ask Questions

EDU-Net: Retinal Pathological Fluid Segmentation in OCT Images with Multiscale Feature Fusion and Boundary Optimization

Apr 22, 2026

Zijun Lei, Zikang Xu, Liang Zhang, Ge Song, Hanyu Guo, Dan Cao, Yujia Zhou, Qianjin Feng

Abstract:Objective: Diabetic macular edema (DME) is the leading cause of severe visual impairment in patients with diabetes. Quantification of retinal fluid, particularly intraretinal fluid (IRF) and subretinal fluid (SRF), plays a critical role in the management of DME. Although optical coherence tomography (OCT) can be used for detection, the variable morphology of fluid accumulation and the blurred boundaries caused by noise interference still limit the accuracy of OCT's automatic segmentation. Methods: Retrospective model development and validation study. This study proposes a novel edge-guided dual-branch encoder-decoder network (EDU-Net) to achieve accurate and efficient automatic segmentation of OCT liquid lesions. The local feature extraction branch is based on the EfficientNet model, which precisely captures tiny lesions by leveraging its lightweight separable convolution and high-resolution feature preservation strategy. The global feature extraction branch is based on the large-kernel efficient convolution (LKEC) module and the downsampling layer design to enhance long-range dependencies and global semantics. EDU-Net applies a multi-category edge-guided attention module to fuse high-frequency boundary detail information to each resolution feature to optimize the boundary segmentation performance. Results: Extensive results on the in-house and public datasets demonstrate that EDU-Net achieves state-of-the-art DSC segmentation performance in terms of efficiency and robustness, especially in the segmentation of IRF lesions. Conclusions: EDU-Net integrates local details with global context and optimizes boundaries, achieving an improvement in the accuracy of automatic segmentation of retinal fluid.

Via

Access Paper or Ask Questions

DUCX: Decomposing Unfairness in Tool-Using Chest X-ray Agents

Feb 28, 2026

Zikang Xu, Ruinan Jin, Xiaoxiao Li

Abstract:Tool-using medical agents can improve chest X-ray question answering by orchestrating specialized vision and language modules, but this added pipeline complexity also creates new pathways for demographic bias beyond standalone models. We present ours (Decomposing Unfairness in Chest X-ray agents), a systematic audit of chest X-ray agents instantiated with MedRAX. To localize where disparities arise, we introduce a stage-wise fairness decomposition that separates end-to-end bias from three agent-specific sources: tool exposure bias (utility gaps conditioned on tool presence), tool transition bias (subgroup differences in tool-routing patterns), and model reasoning bias (subgroup differences in synthesis behaviors). Extensive experiments on tool-used based agentic frameworks across five driver backbones reveal that (i) demographic gaps persist in end-to-end performance, with equalized odds up to 20.79%, and the lowest fairness-utility tradeoff down to 28.65%, and (ii) intermediate behaviors, tool usage, transition patterns, and reasoning traces exhibit distinct subgroup disparities that are not predictable from end-to-end evaluation alone (e.g., conditioned on segmentation-tool availability, the subgroup utility gap reaches as high as 50%). Our findings underscore the need for process-level fairness auditing and debiasing to ensure the equitable deployment of clinical agentic systems. Code is available here: https://anonymous.4open.science/r/DUCK-E5FE/README.md

* 11 pages, 3 figures

Via

Access Paper or Ask Questions

MTFM: A Scalable and Alignment-free Foundation Model for Industrial Recommendation in Meituan

Feb 13, 2026

Xin Song, Zhilin Guan, Ruidong Han, Binghao Tang, Tianwen Chen, Bing Li, Zihao Li, Han Zhang, Fei Jiang, Qing Wang(+5 more)

Abstract:Industrial recommendation systems typically involve multiple scenarios, yet existing cross-domain (CDR) and multi-scenario (MSR) methods often require prohibitive resources and strict input alignment, limiting their extensibility. We propose MTFM (Meituan Foundation Model for Recommendation), a transformer-based framework that addresses these challenges. Instead of pre-aligning inputs, MTFM transforms cross-domain data into heterogeneous tokens, capturing multi-scenario knowledge in an alignment-free manner. To enhance efficiency, we first introduce a multi-scenario user-level sample aggregation that significantly enhances training throughput by reducing the total number of instances. We further integrate Grouped-Query Attention and a customized Hybrid Target Attention to minimize memory usage and computational complexity. Furthermore, we implement various system-level optimizations, such as kernel fusion and the elimination of CPU-GPU blocking, to further enhance both training and inference throughput. Offline and online experiments validate the effectiveness of MTFM, demonstrating that significant performance gains are achieved by scaling both model capacity and multi-scenario training data.

Via

Access Paper or Ask Questions

FAIR-ESI: Feature Adaptive Importance Refinement for Electrophysiological Source Imaging

Jan 22, 2026

Linyong Zou, Liang Zhang, Xiongfei Wang, Jia-Hong Gao, Yi Sun, Shurong Sheng, Kuntao Xiao, Wanli Yang, Pengfei Teng, Guoming Luan(+2 more)

Abstract:An essential technique for diagnosing brain disorders is electrophysiological source imaging (ESI). While model-based optimization and deep learning methods have achieved promising results in this field, the accurate selection and refinement of features remains a central challenge for precise ESI. This paper proposes FAIR-ESI, a novel framework that adaptively refines feature importance across different views, including FFT-based spectral feature refinement, weighted temporal feature refinement, and self-attention-based patch-wise feature refinement. Extensive experiments on two simulation datasets with diverse configurations and two real-world clinical datasets validate our framework's efficacy, highlighting its potential to advance brain disorder diagnosis and offer new insights into brain function.

Via

Access Paper or Ask Questions

U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Oct 08, 2025

Fenghe Tang, Chengqi Dong, Wenxin Ma, Zikang Xu, Heqin Zhu, Zihang Jiang, Rongsheng Wang, Yuhao Wang, Chenxu Wu, Shaohua Kevin Zhou

Figure 1 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Figure 2 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Figure 3 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Figure 4 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Abstract:Over the past decade, U-Net has been the dominant architecture in medical image segmentation, leading to the development of thousands of U-shaped variants. Despite its widespread adoption, there is still no comprehensive benchmark to systematically evaluate their performance and utility, largely because of insufficient statistical validation and limited consideration of efficiency and generalization across diverse datasets. To bridge this gap, we present U-Bench, the first large-scale, statistically rigorous benchmark that evaluates 100 U-Net variants across 28 datasets and 10 imaging modalities. Our contributions are threefold: (1) Comprehensive Evaluation: U-Bench evaluates models along three key dimensions: statistical robustness, zero-shot generalization, and computational efficiency. We introduce a novel metric, U-Score, which jointly captures the performance-efficiency trade-off, offering a deployment-oriented perspective on model progress. (2) Systematic Analysis and Model Selection Guidance: We summarize key findings from the large-scale evaluation and systematically analyze the impact of dataset characteristics and architectural paradigms on model performance. Based on these insights, we propose a model advisor agent to guide researchers in selecting the most suitable models for specific datasets and tasks. (3) Public Availability: We provide all code, models, protocols, and weights, enabling the community to reproduce our results and extend the benchmark with future methods. In summary, U-Bench not only exposes gaps in previous evaluations but also establishes a foundation for fair, reproducible, and practically relevant benchmarking in the next decade of U-Net-based segmentation models. The project can be accessed at: https://fenghetan9.github.io/ubench. Code is available at: https://github.com/FengheTan9/U-Bench.

* 54 pages. The project can be accessed at: https://fenghetan9.github.io/ubench. Code is available at: https://github.com/FengheTan9/U-Bench

Via

Access Paper or Ask Questions

Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Jun 26, 2025

Qiuyi Qi, Xin Li, Ming Kong, Zikang Xu, Bingdi Chen, Qiang Zhu, S Kevin Zhou

Figure 1 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Figure 2 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Figure 3 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Figure 4 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Abstract:Challenges such as the lack of high-quality annotations, long-tailed data distributions, and inconsistent staining styles pose significant obstacles to training neural networks to detect abnormal cells in cytopathology robustly. This paper proposes a style-aligned image composition (SAIC) method that composes high-fidelity and style-preserved pathological images to enhance the effectiveness and robustness of detection models. Without additional training, SAIC first selects an appropriate candidate from the abnormal cell bank based on attribute guidance. Then, it employs a high-frequency feature reconstruction to achieve a style-aligned and high-fidelity composition of abnormal cells and pathological backgrounds. Finally, it introduces a large vision-language model to filter high-quality synthesis images. Experimental results demonstrate that incorporating SAIC-synthesized images effectively enhances the performance and robustness of abnormal cell detection for tail categories and styles, thereby improving overall detection performance. The comprehensive quality evaluation further confirms the generalizability and practicality of SAIC in clinical application scenarios. Our code will be released at https://github.com/Joey-Qi/SAIC.

* MIDL 2025 Oral

Via

Access Paper or Ask Questions

LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Sep 14, 2024

Qiyuan Wang, Shang Zhao, Zikang Xu, S Kevin Zhou

Figure 1 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Figure 2 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Figure 3 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Figure 4 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Abstract:Surgical instrument segmentation is instrumental to minimally invasive surgeries and related applications. Most previous methods formulate this task as single-frame-based instance segmentation while ignoring the natural temporal and stereo attributes of a surgical video. As a result, these methods are less robust against the appearance variation through temporal motion and view change. In this work, we propose a novel LACOSTE model that exploits Location-Agnostic COntexts in Stereo and TEmporal images for improved surgical instrument segmentation. Leveraging a query-based segmentation model as core, we design three performance-enhancing modules. Firstly, we design a disparity-guided feature propagation module to enhance depth-aware features explicitly. To generalize well for even only a monocular video, we apply a pseudo stereo scheme to generate complementary right images. Secondly, we propose a stereo-temporal set classifier, which aggregates stereo-temporal contexts in a universal way for making a consolidated prediction and mitigates transient failures. Finally, we propose a location-agnostic classifier to decouple the location bias from mask prediction and enhance the feature semantics. We extensively validate our approach on three public surgical video datasets, including two benchmarks from EndoVis Challenges and one real radical prostatectomy surgery dataset GraSP. Experimental results demonstrate the promising performances of our method, which consistently achieves comparable or favorable results with previous state-of-the-art approaches.

* Preprint submitted to Medical Image Analysis

Via

Access Paper or Ask Questions

FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Jul 01, 2024

Ruinan Jin, Zikang Xu, Yuan Zhong, Qiongsong Yao, Qi Dou, S. Kevin Zhou, Xiaoxiao Li

Figure 1 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Figure 2 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Figure 3 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Figure 4 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Abstract:The advent of foundation models (FMs) in healthcare offers unprecedented opportunities to enhance medical diagnostics through automated classification and segmentation tasks. However, these models also raise significant concerns about their fairness, especially when applied to diverse and underrepresented populations in healthcare applications. Currently, there is a lack of comprehensive benchmarks, standardized pipelines, and easily adaptable libraries to evaluate and understand the fairness performance of FMs in medical imaging, leading to considerable challenges in formulating and implementing solutions that ensure equitable outcomes across diverse patient populations. To fill this gap, we introduce FairMedFM, a fairness benchmark for FM research in medical imaging.FairMedFM integrates with 17 popular medical imaging datasets, encompassing different modalities, dimensionalities, and sensitive attributes. It explores 20 widely used FMs, with various usages such as zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting in various downstream tasks -- classification and segmentation. Our exhaustive analysis evaluates the fairness performance over different evaluation metrics from multiple perspectives, revealing the existence of bias, varied utility-fairness trade-offs on different FMs, consistent disparities on the same datasets regardless FMs, and limited effectiveness of existing unfairness mitigation methods. Checkout FairMedFM's project page and open-sourced codebase, which supports extendible functionalities and applications as well as inclusive for studies on FMs in medical imaging over the long term.

* 29 pages, 17 figures

Via

Access Paper or Ask Questions

APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Mar 08, 2024

Zikang Xu, Fenghe Tang, Quan Quan, Qingsong Yao, S. Kevin Zhou

Figure 1 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Figure 2 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Figure 3 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Figure 4 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Abstract:Ensuring fairness in deep-learning-based segmentors is crucial for health equity. Much effort has been dedicated to mitigating unfairness in the training datasets or procedures. However, with the increasing prevalence of foundation models in medical image analysis, it is hard to train fair models from scratch while preserving utility. In this paper, we propose a novel method, Adversarial Privacy-aware Perturbations on Latent Embedding (APPLE), that can improve the fairness of deployed segmentors by introducing a small latent feature perturber without updating the weights of the original model. By adding perturbation to the latent vector, APPLE decorates the latent vector of segmentors such that no fairness-related features can be passed to the decoder of the segmentors while preserving the architecture and parameters of the segmentor. Experiments on two segmentation datasets and five segmentors (three U-Net-like and two SAM-like) illustrate the effectiveness of our proposed method compared to several unfairness mitigation methods.

Via

Access Paper or Ask Questions