Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Han Liu

Johns Hopkins University

SoK: Security and Privacy Risks of Medical AI

Sep 11, 2024

Yuanhaur Chang, Han Liu, Evin Jaff, Chenyang Lu, Ning Zhang

Figure 1 for SoK: Security and Privacy Risks of Medical AI

Figure 2 for SoK: Security and Privacy Risks of Medical AI

Figure 3 for SoK: Security and Privacy Risks of Medical AI

Figure 4 for SoK: Security and Privacy Risks of Medical AI

Abstract:The integration of technology and healthcare has ushered in a new era where software systems, powered by artificial intelligence and machine learning, have become essential components of medical products and services. While these advancements hold great promise for enhancing patient care and healthcare delivery efficiency, they also expose sensitive medical data and system integrity to potential cyberattacks. This paper explores the security and privacy threats posed by AI/ML applications in healthcare. Through a thorough examination of existing research across a range of medical domains, we have identified significant gaps in understanding the adversarial attacks targeting medical AI systems. By outlining specific adversarial threat models for medical settings and identifying vulnerable application domains, we lay the groundwork for future research that investigates the security and resilience of AI-driven medical systems. Through our analysis of different threat models and feasibility studies on adversarial attacks in different medical domains, we provide compelling insights into the pressing need for cybersecurity research in the rapidly evolving field of AI healthcare technology.

Via

Access Paper or Ask Questions

Flow-Inspired Lightweight Multi-Robot Real-Time Scheduling Planner

Sep 11, 2024

Han Liu, Yu Jin, Tianjiang Hu, Kai Huang

Abstract:Collision avoidance and trajectory planning are crucial in multi-robot systems, particularly in environments with numerous obstacles. Although extensive research has been conducted in this field, the challenge of rapid traversal through such environments has not been fully addressed. This paper addresses this problem by proposing a novel real-time scheduling scheme designed to optimize the passage of multi-robot systems through complex, obstacle-rich maps. Inspired from network flow optimization, our scheme decomposes the environment into a network structure, enabling the efficient allocation of robots to paths based on real-time congestion data. The proposed scheduling planner operates on top of existing collision avoidance algorithms, focusing on minimizing traversal time by balancing robot detours and waiting times. Our simulation results demonstrate the efficiency of the proposed scheme. Additionally, we validated its effectiveness through real world flight tests using ten quadrotors. This work contributes a lightweight, effective scheduling planner capable of meeting the real-time demands of multi-robot systems in obstacle-rich environments.

Via

Access Paper or Ask Questions

Influence of Early through Late Fusion on Pancreas Segmentation from Imperfectly Registered Multimodal MRI

Sep 06, 2024

Lucas W. Remedios, Han Liu, Samuel W. Remedios, Lianrui Zuo, Adam M. Saunders, Shunxing Bao, Yuankai Huo, Alvin C. Powers, John Virostko, Bennett A. Landman

Abstract:Multimodal fusion promises better pancreas segmentation. However, where to perform fusion in models is still an open question. It is unclear if there is a best location to fuse information when analyzing pairs of imperfectly aligned images. Two main alignment challenges in this pancreas segmentation study are 1) the pancreas is deformable and 2) breathing deforms the abdomen. Even after image registration, relevant deformations are often not corrected. We examine how early through late fusion impacts pancreas segmentation. We used 353 pairs of T2-weighted (T2w) and T1-weighted (T1w) abdominal MR images from 163 subjects with accompanying pancreas labels. We used image registration (deeds) to align the image pairs. We trained a collection of basic UNets with different fusion points, spanning from early to late, to assess how early through late fusion influenced segmentation performance on imperfectly aligned images. We assessed generalization of fusion points on nnUNet. The single-modality T2w baseline using a basic UNet model had a Dice score of 0.73, while the same baseline on the nnUNet model achieved 0.80. For the basic UNet, the best fusion approach occurred in the middle of the encoder (early/mid fusion), which led to a statistically significant improvement of 0.0125 on Dice score compared to the baseline. For the nnUNet, the best fusion approach was na\"ive image concatenation before the model (early fusion), which resulted in a statistically significant Dice score increase of 0.0021 compared to baseline. Fusion in specific blocks can improve performance, but the best blocks for fusion are model specific, and the gains are small. In imperfectly registered datasets, fusion is a nuanced problem, with the art of design remaining vital for uncovering potential insights. Future innovation is needed to better address fusion in cases of imperfect alignment of abdominal image pairs.

* 13.5 pages of manuscript content

Via

Access Paper or Ask Questions

Differentially Private Kernel Density Estimation

Sep 03, 2024

Erzhi Liu, Jerry Yao-Chieh Hu, Alex Reneau, Zhao Song, Han Liu

Abstract:We introduce a refined differentially private (DP) data structure for kernel density estimation (KDE), offering not only improved privacy-utility tradeoff but also better efficiency over prior results. Specifically, we study the mathematical problem: given a similarity function $f$ (or DP KDE) and a private dataset $X \subset \mathbb{R}^d$, our goal is to preprocess $X$ so that for any query $y\in\mathbb{R}^d$, we approximate $\sum_{x \in X} f(x, y)$ in a differentially private fashion. The best previous algorithm for $f(x,y) =\| x - y \|_1$ is the node-contaminated balanced binary tree by [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024]. Their algorithm requires $O(nd)$ space and time for preprocessing with $n=|X|$. For any query point, the query time is $d \log n$, with an error guarantee of $(1+\alpha)$-approximation and $\epsilon^{-1} \alpha^{-0.5} d^{1.5} R \log^{1.5} n$. In this paper, we improve the best previous result [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024] in three aspects: - We reduce query time by a factor of $\alpha^{-1} \log n$. - We improve the approximation ratio from $\alpha$ to 1. - We reduce the error dependence by a factor of $\alpha^{-0.5}$. From a technical perspective, our method of constructing the search tree differs from previous work [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024]. In prior work, for each query, the answer is split into $\alpha^{-1} \log n$ numbers, each derived from the summation of $\log n$ values in interval tree countings. In contrast, we construct the tree differently, splitting the answer into $\log n$ numbers, where each is a smart combination of two distance values, two counting values, and $y$ itself. We believe our tree structure may be of independent interest.

Via

Access Paper or Ask Questions

Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets

Aug 08, 2024

Hao Li, Han Liu, Heinrich von Busch, Robert Grimm, Henkjan Huisman, Angela Tong, David Winkel, Tobias Penzkofer, Ivan Shabunin, Moon Hyung Choi(+9 more)

Figure 1 for Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets

Figure 2 for Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets

Figure 3 for Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets

Figure 4 for Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets

Abstract:Our hypothesis is that UDA using diffusion-weighted images, generated with a unified model, offers a promising and reliable strategy for enhancing the performance of supervised learning models in multi-site prostate lesion detection, especially when various b-values are present. This retrospective study included data from 5,150 patients (14,191 samples) collected across nine different imaging centers. A novel UDA method using a unified generative model was developed for multi-site PCa detection. This method translates diffusion-weighted imaging (DWI) acquisitions, including apparent diffusion coefficient (ADC) and individual DW images acquired using various b-values, to align with the style of images acquired using b-values recommended by Prostate Imaging Reporting and Data System (PI-RADS) guidelines. The generated ADC and DW images replace the original images for PCa detection. An independent set of 1,692 test cases (2,393 samples) was used for evaluation. The area under the receiver operating characteristic curve (AUC) was used as the primary metric, and statistical analysis was performed via bootstrapping. For all test cases, the AUC values for baseline SL and UDA methods were 0.73 and 0.79 (p<.001), respectively, for PI-RADS>=3, and 0.77 and 0.80 (p<.001) for PI-RADS>=4 PCa lesions. In the 361 test cases under the most unfavorable image acquisition setting, the AUC values for baseline SL and UDA were 0.49 and 0.76 (p<.001) for PI-RADS>=3, and 0.50 and 0.77 (p<.001) for PI-RADS>=4 PCa lesions. The results indicate the proposed UDA with generated images improved the performance of SL methods in multi-site PCa lesion detection across datasets with various b values, especially for images acquired with significant deviations from the PI-RADS recommended DWI protocol (e.g. with an extremely high b-value).

* Accept at Radiology: Artificial Intelligence. Journal reference and external DOI will be added once published

Via

Access Paper or Ask Questions

PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software

Jul 24, 2024

Kaixuan Li, Jian Zhang, Sen Chen, Han Liu, Yang Liu, Yixiang Chen

Figure 1 for PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software

Figure 2 for PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software

Figure 3 for PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software

Figure 4 for PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software

Abstract:Open-source software (OSS) vulnerabilities are increasingly prevalent, emphasizing the importance of security patches. However, in widely used security platforms like NVD, a substantial number of CVE records still lack trace links to patches. Although rank-based approaches have been proposed for security patch tracing, they heavily rely on handcrafted features in a single-step framework, which limits their effectiveness. In this paper, we propose PatchFinder, a two-phase framework with end-to-end correlation learning for better-tracing security patches. In the **initial retrieval** phase, we employ a hybrid patch retriever to account for both lexical and semantic matching based on the code changes and the description of a CVE, to narrow down the search space by extracting those commits as candidates that are similar to the CVE descriptions. Afterwards, in the **re-ranking** phase, we design an end-to-end architecture under the supervised fine-tuning paradigm for learning the semantic correlations between CVE descriptions and commits. In this way, we can automatically rank the candidates based on their correlation scores while maintaining low computation overhead. We evaluated our system against 4,789 CVEs from 532 OSS projects. The results are highly promising: PatchFinder achieves a Recall@10 of 80.63% and a Mean Reciprocal Rank (MRR) of 0.7951. Moreover, the Manual Effort@10 required is curtailed to 2.77, marking a 1.94 times improvement over current leading methods. When applying PatchFinder in practice, we initially identified 533 patch commits and submitted them to the official, 482 of which have been confirmed by CVE Numbering Authorities.

* to appear at ISSTA 2024

Via

Access Paper or Ask Questions

VIPS-Odom: Visual-Inertial Odometry Tightly-coupled with Parking Slots for Autonomous Parking

Jul 06, 2024

Xuefeng Jiang, Fangyuan Wang, Rongzhang Zheng, Han Liu, Yixiong Huo, Jinzhang Peng, Lu Tian, Emad Barsoum

Abstract:Precise localization is of great importance for autonomous parking task since it provides service for the downstream planning and control modules, which significantly affects the system performance. For parking scenarios, dynamic lighting, sparse textures, and the instability of global positioning system (GPS) signals pose challenges for most traditional localization methods. To address these difficulties, we propose VIPS-Odom, a novel semantic visual-inertial odometry framework for underground autonomous parking, which adopts tightly-coupled optimization to fuse measurements from multi-modal sensors and solves odometry. Our VIPS-Odom integrates parking slots detected from the synthesized bird-eye-view (BEV) image with traditional feature points in the frontend, and conducts tightly-coupled optimization with joint constraints introduced by measurements from the inertial measurement unit, wheel speed sensor and parking slots in the backend. We develop a multi-object tracking framework to robustly track parking slots' states. To prove the superiority of our method, we equip an electronic vehicle with related sensors and build an experimental platform based on ROS2 system. Extensive experiments demonstrate the efficacy and advantages of our method compared with other baselines for parking scenarios.

* A SLAM Method for Autonomous Parking

Via

Access Paper or Ask Questions

On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

Jul 01, 2024

Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Zhao Song, Han Liu

Figure 1 for On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

Figure 2 for On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

Abstract:We investigate the statistical and computational limits of latent \textbf{Di}ffusion \textbf{T}ransformers (\textbf{DiT}s) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we derive an approximation error bound for the score network of latent DiTs, which is sub-linear in the latent space dimension. Additionally, we derive the corresponding sample complexity bound and show that the data distribution generated from the estimated score function converges toward a proximate area of the original one. Computationally, we characterize the hardness of both forward inference and backward computation of latent DiTs, assuming the Strong Exponential Time Hypothesis (SETH). For forward inference, we identify efficient criteria for all possible latent DiTs inference algorithms and showcase our theory by pushing the efficiency toward almost-linear time inference. For backward computation, we leverage the low-rank structure within the gradient computation of DiTs training for possible algorithmic speedup. Specifically, we show that such speedup achieves almost-linear time latent DiTs training by casting the DiTs gradient as a series of chained low-rank approximations with bounded error. Under the low-dimensional assumption, we show that the convergence rate and the computational efficiency are both dominated by the dimension of the subspace, suggesting that latent DiTs have the potential to bypass the challenges associated with the high dimensionality of initial data.

Via

Access Paper or Ask Questions

Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge

Jun 22, 2024

Han Liu, Hao Li, Jiacheng Wang, Yubo Fan, Zhoubing Xu, Ipek Oguz

Figure 1 for Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge

Figure 2 for Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge

Figure 3 for Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge

Figure 4 for Predicting fluorescent labels in label-free microscopy images with pix2pix and adaptive loss in Light My Cells challenge

Abstract:Fluorescence labeling is the standard approach to reveal cellular structures and other subcellular constituents for microscopy images. However, this invasive procedure may perturb or even kill the cells and the procedure itself is highly time-consuming and complex. Recently, in silico labeling has emerged as a promising alternative, aiming to use machine learning models to directly predict the fluorescently labeled images from label-free microscopy. In this paper, we propose a deep learning-based in silico labeling method for the Light My Cells challenge. Built upon pix2pix, our proposed method can be trained using the partially labeled datasets with an adaptive loss. Moreover, we explore the effectiveness of several training strategies to handle different input modalities, such as training them together or separately. The results show that our method achieves promising performance for in silico labeling. Our code is available at https://github.com/MedICL-VU/LightMyCells.

Via

Access Paper or Ask Questions

Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

Jun 20, 2024

Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar

Figure 1 for Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

Figure 2 for Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

Figure 3 for Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

Figure 4 for Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

Abstract:Modern deep neural networks often require distributed training with many workers due to their large size. As worker numbers increase, communication overheads become the main bottleneck in data-parallel minibatch stochastic gradient methods with per-iteration gradient synchronization. Local gradient methods like Local SGD reduce communication by only syncing after several local steps. Despite understanding their convergence in i.i.d. and heterogeneous settings and knowing the importance of batch sizes for efficiency and generalization, optimal local batch sizes are difficult to determine. We introduce adaptive batch size strategies for local gradient methods that increase batch sizes adaptively to reduce minibatch gradient variance. We provide convergence guarantees under homogeneous data conditions and support our claims with image classification experiments, demonstrating the effectiveness of our strategies in training and generalization.

Via

Access Paper or Ask Questions