Abstract: Recent success of large text-to-image models has empirically underscored the exceptional performance of diffusion models in generative tasks. To facilitate their efficient deployment on resource-constrained edge devices, model quantization has emerged as a pivotal technique for both compression and acceleration. This survey offers a thorough review of the latest advancements in diffusion model quantization, encapsulating and analyzing the current state of the art in this rapidly advancing domain. First, we provide an overview of the key challenges encountered in the quantization of diffusion models, including those based on U-Net architectures and Diffusion Transformers (DiT). We then present a comprehensive taxonomy of prevalent quantization techniques, engaging in an in-depth discussion of their underlying principles. Subsequently, we perform a meticulous analysis of representative diffusion model quantization schemes from both qualitative and quantitative perspectives. From a quantitative standpoint, we rigorously benchmark a variety of methods on widely recognized datasets, delivering an extensive evaluation of the most recent and impactful research in the field. From a qualitative standpoint, we categorize and synthesize the effects of quantization errors, elucidating these impacts through both visual analysis and trajectory examination. In conclusion, we outline prospective avenues for future research, proposing novel directions for the quantization of generative models in practical applications. The list of related papers, corresponding code, pre-trained models, and comparison results is publicly available at the survey project homepage https://github.com/TaylorJocelyn/Diffusion-Model-Quantization.
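For readers new to the topic, the following is a minimal sketch of the basic operation this survey builds on: uniform affine post-training quantization of a weight tensor. The helper name, bit-width, and per-tensor granularity are illustrative assumptions and are not tied to any particular surveyed method.

```python
import torch

def quantize_per_tensor(w: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Uniform affine (asymmetric) quantization of a weight tensor.

    Returns the de-quantized ("fake-quantized") tensor so the induced
    quantization error w - quantize_per_tensor(w) can be inspected directly.
    """
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (w.max() - w.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(-w.min() / scale).clamp(qmin, qmax)
    q = torch.clamp(torch.round(w / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

# Example: measure the error that 8-bit PTQ injects into a random weight matrix.
w = torch.randn(256, 256)
w_q = quantize_per_tensor(w, n_bits=8)
print((w - w_q).abs().mean())
```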
Abstract: Diffusion models have recently emerged as the dominant approach in visual generation tasks. However, the lengthy denoising chains and the computationally intensive noise estimation networks hinder their applicability in low-latency and resource-limited environments. Previous research has endeavored to address these limitations in a decoupled manner, utilizing either advanced samplers or efficient model quantization techniques. In this study, we uncover that quantization-induced noise disrupts directional estimation at each sampling step, and further distorts the directional estimates of higher-order samplers that solve the sampling equations with discretized numerical methods, thereby altering the optimal sampling trajectory. To attain dual acceleration with high fidelity, we propose a sampling-aware quantization strategy, in which a Mixed-Order Trajectory Alignment technique is devised to impose a more stringent constraint on the error bounds at each sampling step, facilitating a more linear probability flow. Extensive experiments on sparse-step fast sampling across multiple datasets demonstrate that our approach preserves the rapid convergence characteristics of high-speed samplers while maintaining superior generation quality. Code will be made publicly available soon.
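To illustrate the claim that quantization noise perturbs the per-step direction of a higher-order sampler, here is a minimal sketch of a second-order (Heun) step in the noise-prediction parameterization, where an additive Gaussian term stands in for quantization error. The callable `eps_model`, the Gaussian error model, and the step form are assumptions for illustration; this is not the Mixed-Order Trajectory Alignment technique itself.

```python
import torch

def heun_step(x, sigma, sigma_next, eps_model, quant_err_std=0.0):
    """One second-order (Heun) sampling step, assuming the per-step ODE
    direction equals the predicted noise (sigma parameterization).
    Additive Gaussian noise on the prediction mimics quantization error,
    perturbing both directional estimates and hence the trajectory.
    """
    def eps(x_in, s_in):
        e = eps_model(x_in, s_in)
        return e + quant_err_std * torch.randn_like(e)  # simulated quantization noise

    d1 = eps(x, sigma)                        # first directional estimate
    x_euler = x + (sigma_next - sigma) * d1   # Euler predictor
    d2 = eps(x_euler, sigma_next)             # second directional estimate
    return x + (sigma_next - sigma) * 0.5 * (d1 + d2)  # Heun corrector
```

With `quant_err_std=0.0` this reduces to the standard full-precision Heun update, which makes the trajectory deviation attributable to the injected error alone.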
Abstract: Previous studies have demonstrated the strong performance of Graph Neural Networks (GNNs) in node classification. However, most existing GNNs adopt a node-centric perspective and rely on global message passing, leading to high computational and memory costs that hinder scalability. To mitigate these challenges, subgraph-based methods have been introduced, leveraging local subgraphs as approximations of full computational trees. While this approach improves efficiency, it often suffers from performance degradation due to the loss of global contextual information, limiting its effectiveness compared to global GNNs. To address this trade-off between scalability and classification accuracy, we reformulate the node classification task as a subgraph classification problem and propose SubGND (Subgraph GNN for NoDe). This framework introduces a differentiated zero-padding strategy and an Ego-Alter subgraph representation method to resolve label conflicts while incorporating an Adaptive Feature Scaling Mechanism to dynamically adjust feature contributions based on dataset-specific dependencies. Experimental results on six benchmark datasets demonstrate that SubGND achieves performance comparable to or surpassing global message-passing GNNs, particularly in heterophilic settings, highlighting its effectiveness and scalability as a promising solution for node classification.
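To make the reformulation of node classification as subgraph classification concrete, here is a minimal sketch of extracting a k-hop ego subgraph, zero-padding its features to a fixed size, and marking the ego node versus its neighbours ("alters") with a role channel. The helper name, padding scheme, and role encoding are illustrative assumptions, not SubGND's exact design.

```python
import torch

def ego_subgraph_features(x, adj, center, num_hops=2, max_nodes=32):
    """Build a fixed-size, zero-padded feature matrix for the k-hop ego
    subgraph around `center`, to be fed to a subgraph-level classifier.

    x:   [N, F] node features; adj: dict mapping node -> set of neighbours.
    """
    frontier, nodes = {center}, {center}
    for _ in range(num_hops):                      # breadth-first k-hop expansion
        frontier = {v for u in frontier for v in adj[u]} - nodes
        nodes |= frontier
    node_list = [center] + sorted(nodes - {center})  # ego first, then alters
    node_list = node_list[:max_nodes]

    feats = torch.zeros(max_nodes, x.size(1) + 1)  # zero-padding to a fixed size
    feats[: len(node_list), : x.size(1)] = x[torch.tensor(node_list)]
    feats[0, -1] = 1.0                             # role channel: 1 = ego, 0 = alter
    return feats
```

The subgraph label is the ego node's label, so a graph-level classifier over these padded matrices performs node classification without global message passing.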
Abstract: Diffusion models have achieved cutting-edge performance in image generation. However, their lengthy denoising process and computationally intensive score estimation network impede their scalability in low-latency and resource-constrained scenarios. Post-training quantization (PTQ) compresses and accelerates diffusion models without retraining, but it inevitably introduces additional quantization noise, resulting in mean and variance deviations. In this work, we propose D2-DPM, a dual denoising mechanism aimed at precisely mitigating the adverse effects of quantization noise on the noise estimation network. Specifically, we first unravel the impact of quantization noise on the sampling equation into two components: the mean deviation and the variance deviation. The mean deviation alters the drift coefficient of the sampling equation, influencing the trajectory trend, while the variance deviation magnifies the diffusion coefficient, impacting the convergence of the sampling trajectory. D2-DPM is thus devised to remove the quantization noise at each time step and then denoise the noisy sample through the inverse diffusion iterations. Experimental results demonstrate that D2-DPM achieves superior generation quality, yielding a 1.42 lower FID than the full-precision model while achieving 3.99x compression and 11.67x bit-operation acceleration.
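The mean/variance decomposition described above can be written out, under assumed notation, as follows: the quantized noise estimate is modeled as the full-precision estimate plus a Gaussian perturbation whose mean shifts the drift of the reverse-time sampling equation, while its (here scalar, heuristically absorbed) variance inflates the diffusion coefficient. This is an illustrative reconstruction of the stated idea, not D2-DPM's exact equations.

```latex
% Assumed notation: reverse-time SDE  dx_t = [f(x_t,t) - g(t)^2 \nabla_x \log p_t(x_t)]\,dt + g(t)\,d\bar{w}_t,
% with the score approximated by -\epsilon_\theta(x_t,t)/\sigma_t.
\begin{align}
  \hat{\epsilon}_\theta(x_t,t) &= \epsilon_\theta(x_t,t) + \Delta_t,
  \qquad \Delta_t \sim \mathcal{N}\!\left(\mu_t,\, \sigma_{q,t}^{2} I\right), \\
  \mathrm{d}x_t &= \Big[\, f(x_t,t) + \tfrac{g(t)^2}{\sigma_t}\,\epsilon_\theta(x_t,t)
      + \underbrace{\tfrac{g(t)^2}{\sigma_t}\,\mu_t}_{\text{mean deviation: shifts the drift}} \Big]\mathrm{d}t
      + \underbrace{\sqrt{g(t)^2 + \tfrac{g(t)^4}{\sigma_t^{2}}\,\sigma_{q,t}^{2}}}_{\text{variance deviation: inflates the diffusion}}\;\mathrm{d}\bar{w}_t .
\end{align}
```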
Abstract: Glaucoma is a leading cause of irreversible blindness worldwide. While deep learning approaches using fundus images have substantially improved early diagnosis of glaucoma, variations in images from different devices and locations (known as domain shifts) challenge the use of pre-trained models in real-world settings. To address this, we propose a novel Graph-guided Test-Time Adaptation (GTTA) framework to generalize glaucoma diagnosis models to unseen test environments. GTTA integrates the topological information of fundus images into model training, enhancing the model's transferability and reducing the risk of learning spurious correlations. During inference, GTTA introduces a novel test-time training objective that makes the source-trained classifier progressively adapt to target patterns with reliable class-conditional estimation and consistency regularization. Experiments on cross-domain glaucoma diagnosis benchmarks demonstrate the superiority of the overall framework and its individual components under different backbone networks.
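As a sketch of what adapting a source-trained classifier with reliable class-conditional estimates and consistency regularization can look like in practice, here is a generic test-time adaptation step: confidence-filtered pseudo-labels from a weakly augmented view supervise a strongly augmented view, plus an entropy-minimization term. The function name, confidence threshold, and loss weight are illustrative assumptions; this is a standard TTA recipe, not GTTA's actual objective.

```python
import torch
import torch.nn.functional as F

def tta_step(model, optimizer, x_weak, x_strong, conf_thresh=0.9):
    """One generic test-time adaptation step on a batch of target images."""
    probs_w = model(x_weak).softmax(dim=1)
    conf, pseudo = probs_w.max(dim=1)
    mask = conf > conf_thresh                      # keep only reliable pseudo-labels

    logits_s = model(x_strong)
    if mask.any():                                 # weak-to-strong consistency
        consistency = F.cross_entropy(logits_s[mask], pseudo[mask])
    else:
        consistency = logits_s.sum() * 0.0         # keeps the graph when nothing passes
    entropy = -(probs_w * probs_w.clamp_min(1e-8).log()).sum(dim=1).mean()

    loss = consistency + 0.1 * entropy             # weighting is illustrative
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice such updates are often restricted to a small parameter subset (e.g. normalization-layer affine parameters) so the source-trained classifier adapts progressively rather than drifting.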