Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mingxuan Liu

Towards Practical Alzheimer's Disease Diagnosis: A Lightweight and Interpretable Spiking Neural Model

Jun 11, 2025

Changwei Wu, Yifei Chen, Yuxin Du, Jinying Zong, Jie Dong, Mingxuan Liu, Yong Peng, Jin Fan, Feiwei Qin, Changmiao Wang

Abstract:Early diagnosis of Alzheimer's Disease (AD), especially at the mild cognitive impairment (MCI) stage, is vital yet hindered by subjective assessments and the high cost of multimodal imaging modalities. Although deep learning methods offer automated alternatives, their energy inefficiency and computational demands limit real-world deployment, particularly in resource-constrained settings. As a brain-inspired paradigm, spiking neural networks (SNNs) are inherently well-suited for modeling the sparse, event-driven patterns of neural degeneration in AD, offering a promising foundation for interpretable and low-power medical diagnostics. However, existing SNNs often suffer from weak expressiveness and unstable training, which restrict their effectiveness in complex medical tasks. To address these limitations, we propose FasterSNN, a hybrid neural architecture that integrates biologically inspired LIF neurons with region-adaptive convolution and multi-scale spiking attention. This design enables sparse, efficient processing of 3D MRI while preserving diagnostic accuracy. Experiments on benchmark datasets demonstrate that FasterSNN achieves competitive performance with substantially improved efficiency and stability, supporting its potential for practical AD screening. Our source code is available at https://github.com/wuchangw/FasterSNN.

* 11 pages, 5 figures

Via

Access Paper or Ask Questions

WLTCL: Wide Field-of-View 3-D LiDAR Truck Compartment Automatic Localization System

Apr 26, 2025

Guodong Sun, Mingjing Li, Dingjie Liu, Mingxuan Liu, Bo Wu, Yang Zhang

Abstract:As an essential component of logistics automation, the automated loading system is becoming a critical technology for enhancing operational efficiency and safety. Precise automatic positioning of the truck compartment, which serves as the loading area, is the primary step in automated loading. However, existing methods have difficulty adapting to truck compartments of various sizes, do not establish a unified coordinate system for LiDAR and mobile manipulators, and often exhibit reliability issues in cluttered environments. To address these limitations, our study focuses on achieving precise automatic positioning of key points in large, medium, and small fence-style truck compartments in cluttered scenarios. We propose an innovative wide field-of-view 3-D LiDAR vehicle compartment automatic localization system. For vehicles of various sizes, this system leverages the LiDAR to generate high-density point clouds within an extensive field-of-view range. By incorporating parking area constraints, our vehicle point cloud segmentation method more effectively segments vehicle point clouds within the scene. Our compartment key point positioning algorithm utilizes the geometric features of the compartments to accurately locate the corner points, providing stackable spatial regions. Extensive experiments on our collected data and public datasets demonstrate that this system offers reliable positioning accuracy and reduced computational resource consumption, leading to its application and promotion in relevant fields.

* To appear in IEEE TIM

Via

Access Paper or Ask Questions

seeBias: A Comprehensive Tool for Assessing and Visualizing AI Fairness

Apr 11, 2025

Yilin Ning, Yian Ma, Mingxuan Liu, Xin Li, Nan Liu

Abstract:Fairness in artificial intelligence (AI) prediction models is increasingly emphasized to support responsible adoption in high-stakes domains such as health care and criminal justice. Guidelines and implementation frameworks highlight the importance of both predictive accuracy and equitable outcomes. However, current fairness toolkits often evaluate classification performance disparities in isolation, with limited attention to other critical aspects such as calibration. To address these gaps, we present seeBias, an R package for comprehensive evaluation of model fairness and predictive performance. seeBias offers an integrated evaluation across classification, calibration, and other performance domains, providing a more complete view of model behavior. It includes customizable visualizations to support transparent reporting and responsible AI implementation. Using public datasets from criminal justice and healthcare, we demonstrate how seeBias supports fairness evaluations, and uncovers disparities that conventional fairness metrics may overlook. The R package is available on GitHub, and a Python version is under development.

Via

Access Paper or Ask Questions

A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography

Nov 22, 2024

Kegang Wang, Jiankai Tang, Yantao Wei, Mingxuan Liu, Xin Liu, Yuntao Wang

Figure 1 for A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography

Figure 2 for A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography

Figure 3 for A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography

Figure 4 for A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography

Abstract:Remote photoplethysmography (rPPG) extracts PPG signals from subtle color changes in facial videos, showing strong potential for health applications. However, most rPPG methods rely on intensity differences between consecutive frames, missing long-term signal variations affected by motion or lighting artifacts, which reduces accuracy. This paper introduces Temporal Normalization (TN), a flexible plug-and-play module compatible with any end-to-end rPPG network architecture. By capturing long-term temporally normalized features following detrending, TN effectively mitigates motion and lighting artifacts, significantly boosting the rPPG prediction performance. When integrated into four state-of-the-art rPPG methods, TN delivered performance improvements ranging from 34.3% to 94.2% in heart rate measurement tasks across four widely-used datasets. Notably, TN showed even greater performance gains in smaller models. We further discuss and provide insights into the mechanisms behind TN's effectiveness.

Via

Access Paper or Ask Questions

Organizing Unstructured Image Collections using Natural Language

Oct 07, 2024

Mingxuan Liu, Zhun Zhong, Jun Li, Gianni Franchi, Subhankar Roy, Elisa Ricci

Figure 1 for Organizing Unstructured Image Collections using Natural Language

Figure 2 for Organizing Unstructured Image Collections using Natural Language

Figure 3 for Organizing Unstructured Image Collections using Natural Language

Figure 4 for Organizing Unstructured Image Collections using Natural Language

Abstract:Organizing unstructured visual data into semantic clusters is a key challenge in computer vision. Traditional deep clustering (DC) approaches focus on a single partition of data, while multiple clustering (MC) methods address this limitation by uncovering distinct clustering solutions. The rise of large language models (LLMs) and multimodal LLMs (MLLMs) has enhanced MC by allowing users to define clustering criteria in natural language. However, manually specifying criteria for large datasets is impractical. In this work, we introduce the task Semantic Multiple Clustering (SMC) that aims to automatically discover clustering criteria from large image collections, uncovering interpretable substructures without requiring human input. Our framework, Text Driven Semantic Multiple Clustering (TeDeSC), uses text as a proxy to concurrently reason over large image collections, discover partitioning criteria, expressed in natural language, and reveal semantic substructures. To evaluate TeDeSC, we introduce the COCO-4c and Food-4c benchmarks, each containing four grouping criteria and ground-truth annotations. We apply TeDeSC to various applications, such as discovering biases and analyzing social media image popularity, demonstrating its utility as a tool for automatically organizing image collections and revealing novel insights.

* Preprint. Project webpage: https://oatmealliu.github.io/smc.html

Via

Access Paper or Ask Questions

Benchmarking mortality risk prediction from electrocardiograms

Jun 26, 2024

Platon Lukyanenko, Joshua Mayourian, Mingxuan Liu, John K. Triedman, Sunil J. Ghelani, William G. La Cava

Abstract:Several recent high-impact studies leverage large hospital-owned electrocardiographic (ECG) databases to model and predict patient mortality. MIMIC-IV, released September 2023, is the first comparable public dataset and includes 800,000 ECGs from a U.S. hospital system. Previously, the largest public ECG dataset was Code-15, containing 345,000 ECGs collected during routine care in Brazil. These datasets now provide an excellent resource for a broader audience to explore ECG survival modeling. Here, we benchmark survival model performance on Code-15 and MIMIC-IV with two neural network architectures, compare four deep survival modeling approaches to Cox regressions trained on classifier outputs, and evaluate performance at one to ten years. Our results yield AUROC and concordance scores comparable to past work (circa 0.8) and reasonable AUPRC scores (MIMIC-IV: 0.4-0.5, Code-15: 0.05-0.13) considering the fraction of ECG samples linked to a mortality (MIMIC-IV: 27\%, Code-15: 4\%). When evaluating models on the opposite dataset, AUROC and concordance values drop by 0.1-0.15, which may be due to cohort differences. All code and results are made public.

* 9 pages plus appendix, 2 figures

Via

Access Paper or Ask Questions

Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine

Jun 18, 2024

Rui Yang, Yilin Ning, Emilia Keppo, Mingxuan Liu, Chuan Hong, Danielle S Bitterman, Jasmine Chiat Ling Ong, Daniel Shu Wei Ting, Nan Liu

Abstract:Generative artificial intelligence (AI) has brought revolutionary innovations in various fields, including medicine. However, it also exhibits limitations. In response, retrieval-augmented generation (RAG) provides a potential solution, enabling models to generate more accurate contents by leveraging the retrieval of external knowledge. With the rapid advancement of generative AI, RAG can pave the way for connecting this transformative technology with medical applications and is expected to bring innovations in equity, reliability, and personalization to health care.

Via

Access Paper or Ask Questions

Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer

Jun 13, 2024

Guodong Sun, Junjie Liu, Mingxuan Liu, Moyun Liu, Yang Zhang

Abstract:Self-supervised monocular depth estimation aims to infer depth information without relying on labeled data. However, the lack of labeled information poses a significant challenge to the model's representation, limiting its ability to capture the intricate details of the scene accurately. Prior information can potentially mitigate this issue, enhancing the model's understanding of scene structure and texture. Nevertheless, solely relying on a single type of prior information often falls short when dealing with complex scenes, necessitating improvements in generalization performance. To address these challenges, we introduce a novel self-supervised monocular depth estimation model that leverages multiple priors to bolster representation capabilities across spatial, context, and semantic dimensions. Specifically, we employ a hybrid transformer and a lightweight pose network to obtain long-range spatial priors in the spatial dimension. Then, the context prior attention is designed to improve generalization, particularly in complex structures or untextured areas. In addition, semantic priors are introduced by leveraging semantic boundary loss, and semantic prior attention is supplemented, further refining the semantic features extracted by the decoder. Experiments on three diverse datasets demonstrate the effectiveness of the proposed model. It integrates multiple priors to comprehensively enhance the representation ability, improving the accuracy and reliability of depth estimation. Codes are available at: \url{https://github.com/MVME-HBUT/MPRLNet}

* 28 pages, 12 figures

Via

Access Paper or Ask Questions

Towards Clinical AI Fairness: Filling Gaps in the Puzzle

May 28, 2024

Mingxuan Liu, Yilin Ning, Salinelat Teixayavong, Xiaoxuan Liu, Mayli Mertens, Yuqing Shang, Xin Li, Di Miao, Jie Xu, Daniel Shu Wei Ting(+9 more)

Abstract:The ethical integration of Artificial Intelligence (AI) in healthcare necessitates addressing fairness-a concept that is highly context-specific across medical fields. Extensive studies have been conducted to expand the technical components of AI fairness, while tremendous calls for AI fairness have been raised from healthcare. Despite this, a significant disconnect persists between technical advancements and their practical clinical applications, resulting in a lack of contextualized discussion of AI fairness in clinical settings. Through a detailed evidence gap analysis, our review systematically pinpoints several deficiencies concerning both healthcare data and the provided AI fairness solutions. We highlight the scarcity of research on AI fairness in many medical domains where AI technology is increasingly utilized. Additionally, our analysis highlights a substantial reliance on group fairness, aiming to ensure equality among demographic groups from a macro healthcare system perspective; in contrast, individual fairness, focusing on equity at a more granular level, is frequently overlooked. To bridge these gaps, our review advances actionable strategies for both the healthcare and AI research communities. Beyond applying existing AI fairness methods in healthcare, we further emphasize the importance of involving healthcare professionals to refine AI fairness concepts and methods to ensure contextually relevant and ethically sound AI applications in healthcare.

Via

Access Paper or Ask Questions

SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

May 16, 2024

Mingxuan Liu, Tyler L. Hayes, Elisa Ricci, Gabriela Csurka, Riccardo Volpi

Abstract:Open-vocabulary object detection (OvOD) has transformed detection into a language-guided task, empowering users to freely define their class vocabularies of interest during inference. However, our initial investigation indicates that existing OvOD detectors exhibit significant variability when dealing with vocabularies across various semantic granularities, posing a concern for real-world deployment. To this end, we introduce Semantic Hierarchy Nexus (SHiNe), a novel classifier that uses semantic knowledge from class hierarchies. It runs offline in three steps: i) it retrieves relevant super-/sub-categories from a hierarchy for each target class; ii) it integrates these categories into hierarchy-aware sentences; iii) it fuses these sentence embeddings to generate the nexus classifier vector. Our evaluation on various detection benchmarks demonstrates that SHiNe enhances robustness across diverse vocabulary granularities, achieving up to +31.9% mAP50 with ground truth hierarchies, while retaining improvements using hierarchies generated by large language models. Moreover, when applied to open-vocabulary classification on ImageNet-1k, SHiNe improves the CLIP zero-shot baseline by +2.8% accuracy. SHiNe is training-free and can be seamlessly integrated with any off-the-shelf OvOD detector, without incurring additional computational overhead during inference. The code is open source.

* Accepted as a conference paper (highlight) at CVPR 2024

Via

Access Paper or Ask Questions