Xuhong Li

CUPre: Cross-domain Unsupervised Pre-training for Few-Shot Cell Segmentation

Oct 06, 2023
Weibin Liao, Xuhong Li, Qingzhong Wang, Yanwu Xu, Zhaozheng Yin, Haoyi Xiong

While pre-training on object detection tasks, such as Common Objects in Context (COCO) [1], can significantly boost the performance of cell segmentation, fine-tuning the pre-trained model still requires a massive number of finely annotated cell images [2], with bounding boxes, masks, and cell types for every cell in every image. To lower the cost of annotation, this work considers the problem of pre-training DNN models for few-shot cell segmentation, where massive unlabeled cell images are available but only a small proportion is annotated. We propose Cross-domain Unsupervised Pre-training (CUPre), which transfers the capability of object detection and instance segmentation for common visual objects (learned from COCO) to the visual domain of cells using unlabeled images. Given a standard COCO pre-trained network with backbone, neck, and head modules, CUPre adopts an alternate multi-task pre-training (AMT2) procedure with two sub-tasks: in every iteration of pre-training, AMT2 first trains the backbone with cell images from multiple cell datasets via unsupervised momentum contrastive learning (MoCo) [3], and then trains the whole model on the vanilla COCO dataset via instance segmentation. After pre-training, CUPre fine-tunes the whole model on the cell segmentation task using a few annotated images. We carry out extensive experiments to evaluate CUPre using the LIVECell [2] and BBBC038 [4] datasets in few-shot instance segmentation settings. The experiments show that CUPre can outperform existing pre-training methods, achieving the highest average precision (AP) for few-shot cell segmentation and detection.
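
The alternation described in the abstract above can be pictured with a small schematic. This is only a sketch under stated assumptions, not the authors' implementation: `amt2_schedule`, the sub-task labels, and the toy parameter lists are hypothetical names introduced here, and the momentum rule shown is the standard MoCo key-encoder update, which CUPre may configure differently.

```python
import numpy as np

def momentum_update(key_params, query_params, m=0.999):
    """Standard MoCo rule: the key encoder is an exponential moving
    average of the query encoder, k <- m * k + (1 - m) * q."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

def amt2_schedule(num_iterations):
    """Yield the order of sub-tasks per iteration (hypothetical labels)."""
    for it in range(num_iterations):
        # Sub-task 1: contrastive learning on unlabeled cell images,
        # which updates the backbone only.
        yield it, "moco_on_cell_images"
        # Sub-task 2: supervised instance segmentation on COCO,
        # which updates backbone, neck, and head.
        yield it, "instance_seg_on_coco"

# Toy parameters standing in for encoder weights.
query = [np.ones(3)]
key = [np.zeros(3)]
key = momentum_update(key, query, m=0.9)

steps = list(amt2_schedule(2))
```

With `m` close to 1 the key encoder changes slowly, which is what keeps the contrastive targets stable between iterations.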

MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-ray Images of Multiple Body Parts

Oct 03, 2023
Weibin Liao, Haoyi Xiong, Qingzhong Wang, Yan Mo, Xuhong Li, Yi Liu, Zeyu Chen, Siyu Huang, Dejing Dou

While self-supervised learning (SSL) algorithms have been widely used to pre-train deep models, few efforts [11] have been made to improve representation learning for X-ray image analysis with SSL pre-trained models. In this work, we study a novel self-supervised pre-training pipeline, namely Multi-task Self-supervised Continual Learning (MUSCLE), for multiple medical imaging tasks, such as classification and segmentation, using X-ray images collected from multiple body parts, including heads, lungs, and bones. Specifically, MUSCLE aggregates X-rays collected from multiple body parts for MoCo-based representation learning, and adopts a well-designed continual learning (CL) procedure to further pre-train the backbone on various X-ray analysis tasks jointly. Certain strategies for image pre-processing, learning schedules, and regularization are used to address the data heterogeneity, overfitting, and catastrophic forgetting problems of multi-task/dataset learning in MUSCLE. We evaluate MUSCLE using 9 real-world X-ray datasets with various tasks, including pneumonia classification, skeletal abnormality classification, lung segmentation, and tuberculosis (TB) detection. Comparisons against other pre-trained models [7] confirm the proof of concept that self-supervised multi-task/dataset continual pre-training can boost the performance of X-ray image analysis.

* Accepted by Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022

Propagation Modeling for Physically Large Arrays: Measurements and Multipath Component Visibility

May 10, 2023
Thomas Wilding, Benjamin J. B. Deutschmann, Christian Nelson, Xuhong Li, Fredrik Tufvesson, Klaus Witrisal

This paper deals with propagation and channel modeling for physically large arrays. The focus lies on acquiring a spatially consistent model, which is essential, especially for positioning and sensing applications. Ultra-wideband, synthetic array measurement data have been acquired with large positioning devices to support this research. We present a modified multipath channel model that accounts for a varying visibility of multipath components along a large array. Based on a geometric model of the measurement environment, we analyze the visibility of specular components. We show that, depending on the size of the reflecting surface, geometric visibility and amplitude estimates obtained with a super-resolution channel estimation algorithm show a strong correspondence. Furthermore, we highlight the capabilities of the developed synthetic array measurement system.

* 6 pages, 6 figures, submitted to EuCNC-2023 
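
The geometric visibility of a specular component along a large array, as mentioned in the abstract above, can be illustrated with the classical mirror-source construction. The sketch below is a simplified 2D illustration under assumptions made here, not the paper's model: the wall lies on the line y = 0 between `wall_x_min` and `wall_x_max`, both the transmitter and the array elements lie above it, and the function name `specular_visible` is hypothetical. Diffraction and finite Fresnel zones are ignored.

```python
import numpy as np

def specular_visible(tx, rx_positions, wall_x_min, wall_x_max):
    """Return a boolean mask: True where the specular reflection point
    for a given array element falls on the finite wall segment."""
    tx = np.asarray(tx, dtype=float)
    rx = np.asarray(rx_positions, dtype=float)
    # Mirror the transmitter across the wall line y = 0, giving
    # (tx_x, -tx_y). The straight line from this mirror source to an
    # element crosses y = 0 at the reflection point.
    s = tx[1] / (tx[1] + rx[:, 1])
    x_hit = tx[0] + s * (rx[:, 0] - tx[0])
    return (x_hit >= wall_x_min) & (x_hit <= wall_x_max)

# Five array elements on a line at height 1, transmitter at (0, 1).
elements = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]])
mask = specular_visible((0.0, 1.0), elements, wall_x_min=0.4, wall_x_max=1.6)
```

In this toy geometry only the middle elements see the reflection, which is the kind of position-dependent visibility a spatially consistent model has to capture.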

Large Intelligent Surface Measurements for Joint Communication and Sensing

Apr 24, 2023
Christian Nelson, Xuhong Li, Thomas Wilding, Benjamin Deutschmann, Klaus Witrisal, Fredrik Tufvesson

Multiple concepts for future generations of wireless communication standards utilize coherent processing of signals from many distributed antennas. Names for these concepts include distributed MIMO, cell-free massive MIMO, XL-MIMO, and large intelligent surfaces. They aim to improve communication reliability, capacity, and energy efficiency, and to provide possibilities for new applications through joint communication and sensing. One such recently proposed solution is the concept of RadioWeaves. It proposes a new radio infrastructure for distributed MIMO with distributed internal processing, storage, and compute resources integrated into the infrastructure. The large bandwidths available in the higher bands have inspired much work on sensing in the mmWave and sub-THz bands; however, sub-6 GHz cellular bands will still be the main provider of broad cellular coverage due to their more favorable propagation conditions. In this paper, we present results from a sub-6 GHz measurement campaign targeting the non-stationary spatial channel statistics for a large RadioWeave and the temporal non-stationarity in a dynamic scenario with RadioWeaves. From the results, we also predict the possibility of multi-static sensing and positioning of users in the environment.

* 6 pages, 2 columns, 12 figures, IEEE European Conference on Networks and Communications & 6G Summit 2023 

Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability

Apr 01, 2023
Haoyi Xiong, Xuhong Li, Boyang Yu, Zhanxing Zhu, Dongrui Wu, Dejing Dou

Random label noises (or observational noises) widely exist in practical machine learning settings. While previous studies primarily focus on the effects of label noises on learning performance, our work investigates the implicit regularization effects of label noises under the mini-batch sampling settings of stochastic gradient descent (SGD), assuming that the label noises are unbiased. Specifically, we analyze the learning dynamics of SGD over the quadratic loss with unbiased label noises, where we model the dynamics of SGD as a stochastic differential equation (SDE) with two diffusion terms (namely a Doubly Stochastic Model). While the first diffusion term is caused by mini-batch sampling over the (label-noiseless) loss gradients, as in many other works on SGD, our model investigates the second noise term of the SGD dynamics, which is caused by mini-batch sampling over the label noises, as an implicit regularizer. Our theoretical analysis finds that such an implicit regularizer favors convergence points that stabilize model outputs against perturbations of parameters (namely inference stability). Though similar phenomena have been investigated, our work does not assume SGD to be an Ornstein-Uhlenbeck-like process and achieves a more generalizable result, with convergence of the approximation proved. To validate our analysis, we design two sets of empirical studies to analyze the implicit regularizer of SGD with unbiased random label noises for deep neural network training and linear regression.

* The complete manuscript of our previous submission to ICLR'21 (https://openreview.net/forum?id=g4szfsQUdy3). This manuscript was mostly completed in 2021. We submitted it to several venues, but unfortunately it has not been accepted yet
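
The two noise sources named in the abstract above can be made concrete for linear regression with quadratic loss. The following is a minimal numerical sketch, assuming Gaussian inputs and zero-mean Gaussian label noise (both choices are assumptions made here, as are all variable names); it only checks the algebraic split of one mini-batch gradient and does not reproduce the paper's SDE analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, batch = 256, 5, 32

X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y_clean = X @ w_true
eps = rng.normal(scale=0.5, size=n)  # unbiased (zero-mean) label noise
y_noisy = y_clean + eps

def minibatch_grad(w, idx, y):
    """Gradient of the mean of 0.5 * (x^T w - y)^2 over the rows in idx."""
    residual = X[idx] @ w - y[idx]
    return X[idx].T @ residual / len(idx)

w = np.zeros(d)
idx = rng.choice(n, size=batch, replace=False)

g_noisy = minibatch_grad(w, idx, y_noisy)           # what SGD actually uses
full_grad = minibatch_grad(w, np.arange(n), y_clean)

# Noise term 1: mini-batch sampling over the label-noiseless gradients.
sampling_noise = minibatch_grad(w, idx, y_clean) - full_grad
# Noise term 2: mini-batch sampling over the label noises.
label_noise_term = -X[idx].T @ eps[idx] / batch

recombined = full_grad + sampling_noise + label_noise_term
```

Because the loss is quadratic, the split is exact: the noisy mini-batch gradient equals the clean full gradient plus the two noise terms, and the second term has zero mean whenever the label noise does.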

Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features

Dec 20, 2022
Qingrui Jia, Xuhong Li, Lei Yu, Jiang Bian, Penghao Zhao, Shupeng Li, Haoyi Xiong, Dejing Dou

While mislabeled or ambiguously labeled samples in the training set can negatively affect the performance of deep models, diagnosing the dataset and identifying mislabeled samples helps to improve generalization. Training dynamics, i.e., the traces left by iterations of optimization algorithms, have recently been shown to be effective for localizing mislabeled samples with hand-crafted features. In this paper, beyond manually designed features, we introduce a novel learning-based solution, leveraging a noise detector, instantiated by an LSTM network, which learns to predict whether a sample was mislabeled using the raw training dynamics as input. Specifically, the proposed method trains the noise detector in a supervised manner using a dataset with synthesized label noise and can adapt to various datasets (with either natural or synthesized label noise) without retraining. We conduct extensive experiments to evaluate the proposed method. We train the noise detector on the synthesized label-noised CIFAR dataset and test it on Tiny ImageNet, CUB-200, Caltech-256, WebVision and Clothing1M. Results show that the proposed method precisely detects mislabeled samples on various datasets without further adaptation, and outperforms state-of-the-art methods. Further experiments demonstrate that mislabel identification can guide label correction, namely data debugging, providing improvements from the data aspect that are orthogonal to algorithm-centric state-of-the-art techniques.

* Accepted as a conference paper at AAAI23

High-Resolution Channel Sounding and Parameter Estimation in Multi-Site Cellular Networks

Nov 17, 2022
Junshi Chen, Russ Whiton, Xuhong Li, Fredrik Tufvesson

Accurate understanding of electromagnetic propagation properties in real environments is necessary for efficient design and deployment of cellular systems. In this paper, we show a method to estimate high-resolution channel parameters with a massive antenna array in real network deployments. An antenna array mounted on a vehicle is used to receive downlink long-term evolution (LTE) reference signals from neighboring base stations (BS) with mutual interference. Delay and angular information of multipath components is estimated with a novel inter-cell interference cancellation algorithm and an extension of the RIMAX algorithm. The estimated high-resolution channel parameters are consistent with the movement pattern of the vehicle and the geometry of the environment and allow for refined channel modeling and precise cellular positioning.

$\textbf{P$^2$A}$: A Dataset and Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos

Jul 26, 2022
Jiang Bian, Qingzhong Wang, Haoyi Xiong, Jun Huang, Chen Liu, Xuhong Li, Jun Cheng, Jun Zhao, Feixiang Lu, Dejing Dou

While deep learning has been widely used for video analytics, such as video classification and action detection, dense action detection with fast-moving subjects in sports videos is still challenging. In this work, we release yet another sports video dataset $\textbf{P$^2$A}$ for $\underline{P}$ing $\underline{P}$ong-$\underline{A}$ction detection, which consists of 2,721 video clips collected from the broadcast videos of professional table tennis matches in the World Table Tennis Championships and Olympiads. We work with a crew of table tennis professionals and referees to obtain fine-grained action labels (in 14 classes) for every ping-pong action that appears in the dataset and formulate two sets of action detection problems: action localization and action recognition. We evaluate a number of commonly used action recognition models (e.g., TSM, TSN, Video SwinTransformer, and Slowfast) and action localization models (e.g., BSN, BSN++, BMN, TCANet), using $\textbf{P$^2$A}$ for both problems, under various settings. These models achieve only 48% area under the AR-AN curve for localization and 82% top-one accuracy for recognition, because the ping-pong actions are dense with fast-moving subjects while the broadcast videos run at only 25 FPS. The results confirm that $\textbf{P$^2$A}$ is still a challenging task and can be used as a benchmark for action detection from videos.

Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models

Jul 04, 2022
Xuhong Li, Haoyi Xiong, Yi Liu, Dingfu Zhou, Zeyu Chen, Yaqing Wang, Dejing Dou

While fine-tuning pre-trained networks has become a popular way to train image segmentation models, the backbone networks for image segmentation are frequently pre-trained using image classification source datasets, e.g., ImageNet. Though image classification datasets can provide the backbone networks with rich visual features and discriminative ability, they are incapable of fully pre-training the target model (i.e., backbone+segmentation modules) in an end-to-end manner. The segmentation modules are left to random initialization in the fine-tuning process due to the lack of segmentation labels in classification datasets. In our work, we propose a method that leverages Pseudo Semantic Segmentation Labels (PSSL) to enable end-to-end pre-training of image segmentation models based on classification datasets. PSSL was inspired by the observation that the explanation results of classification models, obtained through explanation algorithms such as CAM, SmoothGrad and LIME, are close to the pixel clusters of visual objects. Specifically, PSSL is obtained for each image by interpreting the classification results and aggregating an ensemble of explanations queried from multiple classifiers to lower the bias caused by single models. With PSSL for every image of ImageNet, the proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse. Experimental results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models, i.e., PSPNet-ResNet50, DeepLabV3-ResNet50, and OCRNet-HRNetW18, on a number of segmentation tasks, such as CamVid, VOC-A, VOC-C, ADE20K, and CityScapes, with significant improvements. The source code is available at https://github.com/PaddlePaddle/PaddleSeg.

* Accepted by Machine Learning 
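
The idea of turning an ensemble of explanations into a pseudo segmentation label, as described in the abstract above, can be sketched roughly as follows. This is an illustration under assumptions made here, not the PSSL procedure itself: min-max normalization, plain averaging, the fixed `threshold`, and the function names are all hypothetical choices, and the paper's actual aggregation and weighting may differ.

```python
import numpy as np

def normalize(saliency):
    """Min-max scale one explanation map to [0, 1]."""
    lo, hi = saliency.min(), saliency.max()
    if hi == lo:
        return np.zeros_like(saliency, dtype=float)
    return (saliency - lo) / (hi - lo)

def pseudo_label(explanations, class_id, threshold=0.5, background=0):
    """Average normalized explanation maps from several classifiers and
    mark pixels at or above the threshold with the image's class id."""
    maps = [normalize(np.asarray(e, dtype=float)) for e in explanations]
    avg = np.mean(maps, axis=0)
    return np.where(avg >= threshold, class_id, background)

# Two toy 2x2 explanation maps for one image labeled with class 7.
maps = [
    [[0.0, 1.0], [0.0, 2.0]],
    [[0.0, 4.0], [0.0, 0.0]],
]
label = pseudo_label(maps, class_id=7)
```

Averaging over several classifiers is what dampens the bias of any single model's explanation; the threshold then decides how much of the averaged evidence counts as object.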

Sequential Detection and Estimation of Multipath Channel Parameters Using Belief Propagation

Sep 12, 2021
Xuhong Li, Erik Leitinger, Alexander Venus, Fredrik Tufvesson

This paper proposes a belief propagation (BP)-based algorithm for sequential detection and estimation of multipath component (MPC) parameters based on radio signals. Under dynamic channel conditions with a moving transmitter and/or receiver, the number of MPCs reflected from visible geometric features, the MPC dispersion parameters (delay, angle, Doppler frequency, etc.), and the number of false alarm contributions are unknown and time-varying. We develop a Bayesian model for sequential detection and estimation of MPC dispersion parameters, and represent it by a factor graph, enabling the use of BP for efficient computation of the marginal posterior distributions. At each time instance, a snapshot-based channel estimator provides parameter estimates of a set of MPCs, which are used as noisy measurements by the proposed BP-based algorithm. It performs joint probabilistic data association and estimation of the time-varying MPC parameters and of the mean number of false alarm measurements by means of the sum-product algorithm rules. The results using synthetic measurements show that the proposed algorithm is able to cope with a high number of false alarm measurements originating from the snapshot-based channel estimator and to sequentially detect and estimate the parameters of MPCs with very low signal-to-noise ratio (SNR). The performance of the proposed algorithm compares well to existing algorithms for high-SNR MPCs, and it significantly outperforms them for medium- or low-SNR MPCs. In particular, we show that our algorithm outperforms the Kalman enhanced super resolution tracking (KEST) algorithm, a state-of-the-art sequential channel parameter estimation method. Furthermore, results with real radio measurements demonstrate the excellent performance of the algorithm in realistic and challenging scenarios.

* 35 pages (single column), 9 figures. To be submitted to the IEEE Transactions on Wireless Communications for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible