Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuliang Gu

LIGM

From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection

Feb 25, 2026

Yepeng Liu, Hao Li, Liwen Yang, Fangzhen Li, Xudi Ge, Yuliang Gu, kuang Gao, Bing Wang, Guang Chen, Hangjun Ye(+1 more)

Abstract:Keypoint-based matching is a fundamental component of modern 3D vision systems, such as Structure-from-Motion (SfM) and SLAM. Most existing learning-based methods are trained on image pairs, a paradigm that fails to explicitly optimize for the long-term trackability of keypoints across sequences under challenging viewpoint and illumination changes. In this paper, we reframe keypoint detection as a sequential decision-making problem. We introduce TraqPoint, a novel, end-to-end Reinforcement Learning (RL) framework designed to optimize the \textbf{Tra}ck-\textbf{q}uality (Traq) of keypoints directly on image sequences. Our core innovation is a track-aware reward mechanism that jointly encourages the consistency and distinctiveness of keypoints across multiple views, guided by a policy gradient method. Extensive evaluations on sparse matching benchmarks, including relative pose estimation and 3D reconstruction, demonstrate that TraqPoint significantly outperforms some state-of-the-art (SOTA) keypoint detection and description methods.

* There are unresolved issues regarding authorship and manuscript details. We withdraw this submission to make necessary corrections

Via

Access Paper or Ask Questions

Consistent-Point: Consistent Pseudo-Points for Semi-Supervised Crowd Counting and Localization

Mar 16, 2025

Yuda Zou, Zelong Liu, Yuliang Gu, Bo Du, Yongchao Xu

Figure 1 for Consistent-Point: Consistent Pseudo-Points for Semi-Supervised Crowd Counting and Localization

Figure 2 for Consistent-Point: Consistent Pseudo-Points for Semi-Supervised Crowd Counting and Localization

Figure 3 for Consistent-Point: Consistent Pseudo-Points for Semi-Supervised Crowd Counting and Localization

Figure 4 for Consistent-Point: Consistent Pseudo-Points for Semi-Supervised Crowd Counting and Localization

Abstract:Crowd counting and localization are important in applications such as public security and traffic management. Existing methods have achieved impressive results thanks to extensive laborious annotations. This paper propose a novel point-localization-based semi-supervised crowd counting and localization method termed Consistent-Point. We identify and address two inconsistencies of pseudo-points, which have not been adequately explored. To enhance their position consistency, we aggregate the positions of neighboring auxiliary proposal-points. Additionally, an instance-wise uncertainty calibration is proposed to improve the class consistency of pseudo-points. By generating more consistent pseudo-points, Consistent-Point provides more stable supervision to the training process, yielding improved results. Extensive experiments across five widely used datasets and three different labeled ratio settings demonstrate that our method achieves state-of-the-art performance in crowd localization while also attaining impressive crowd counting results. The code will be available.

Via

Access Paper or Ask Questions

Task-Parameter Nexus: Task-Specific Parameter Learning for Model-Based Control

Dec 17, 2024

Sheng Cheng, Ran Tao, Yuliang Gu, Shenlong Wang, Xiaofeng Wang, Naira Hovakimyan

Figure 1 for Task-Parameter Nexus: Task-Specific Parameter Learning for Model-Based Control

Figure 2 for Task-Parameter Nexus: Task-Specific Parameter Learning for Model-Based Control

Figure 3 for Task-Parameter Nexus: Task-Specific Parameter Learning for Model-Based Control

Figure 4 for Task-Parameter Nexus: Task-Specific Parameter Learning for Model-Based Control

Abstract:This paper presents the Task-Parameter Nexus (TPN), a learning-based approach for online determination of the (near-)optimal control parameters of model-based controllers (MBCs) for tracking tasks. In TPN, a deep neural network is introduced to predict the control parameters for any given tracking task at runtime, especially when optimal parameters for new tasks are not immediately available. To train this network, we constructed a trajectory bank with various speeds and curvatures that represent different motion characteristics. Then, for each trajectory in the bank, we auto-tune the optimal control parameters offline and use them as the corresponding ground truth. With this dataset, the TPN is trained by supervised learning. We evaluated the TPN on the quadrotor platform. In simulation experiments, it is shown that the TPN can predict near-optimal control parameters for a spectrum of tracking tasks, demonstrating its robust generalization capabilities to unseen tasks.

Via

Access Paper or Ask Questions

Shape Transformation Driven by Active Contour for Class-Imbalanced Semi-Supervised Medical Image Segmentation

Oct 18, 2024

Yuliang Gu, Yepeng Liu, Zhichao Sun, Jinchi Zhu, Yongchao Xu, Laurent Najman

Figure 1 for Shape Transformation Driven by Active Contour for Class-Imbalanced Semi-Supervised Medical Image Segmentation

Figure 2 for Shape Transformation Driven by Active Contour for Class-Imbalanced Semi-Supervised Medical Image Segmentation

Figure 3 for Shape Transformation Driven by Active Contour for Class-Imbalanced Semi-Supervised Medical Image Segmentation

Figure 4 for Shape Transformation Driven by Active Contour for Class-Imbalanced Semi-Supervised Medical Image Segmentation

Abstract:Annotating 3D medical images demands expert knowledge and is time-consuming. As a result, semi-supervised learning (SSL) approaches have gained significant interest in 3D medical image segmentation. The significant size differences among various organs in the human body lead to imbalanced class distribution, which is a major challenge in the real-world application of these SSL approaches. To address this issue, we develop a novel Shape Transformation driven by Active Contour (STAC), that enlarges smaller organs to alleviate imbalanced class distribution across different organs. Inspired by curve evolution theory in active contour methods, STAC employs a signed distance function (SDF) as the level set function, to implicitly represent the shape of organs, and deforms voxels in the direction of the steepest descent of SDF (i.e., the normal vector). To ensure that the voxels far from expansion organs remain unchanged, we design an SDF-based weight function to control the degree of deformation for each voxel. We then use STAC as a data-augmentation process during the training stage. Experimental results on two benchmark datasets demonstrate that the proposed method significantly outperforms some state-of-the-art methods. Source code is publicly available at https://github.com/GuGuLL123/STAC.

* 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Dec 2024, Lisbon (Portugal), Portugal

Via

Access Paper or Ask Questions

Progressive Retinal Image Registration via Global and Local Deformable Transformations

Sep 02, 2024

Yepeng Liu, Baosheng Yu, Tian Chen, Yuliang Gu, Bo Du, Yongchao Xu, Jun Cheng

Figure 1 for Progressive Retinal Image Registration via Global and Local Deformable Transformations

Figure 2 for Progressive Retinal Image Registration via Global and Local Deformable Transformations

Figure 3 for Progressive Retinal Image Registration via Global and Local Deformable Transformations

Figure 4 for Progressive Retinal Image Registration via Global and Local Deformable Transformations

Abstract:Retinal image registration plays an important role in the ophthalmological diagnosis process. Since there exist variances in viewing angles and anatomical structures across different retinal images, keypoint-based approaches become the mainstream methods for retinal image registration thanks to their robustness and low latency. These methods typically assume the retinal surfaces are planar, and adopt feature matching to obtain the homography matrix that represents the global transformation between images. Yet, such a planar hypothesis inevitably introduces registration errors since retinal surface is approximately curved. This limitation is more prominent when registering image pairs with significant differences in viewing angles. To address this problem, we propose a hybrid registration framework called HybridRetina, which progressively registers retinal images with global and local deformable transformations. For that, we use a keypoint detector and a deformation network called GAMorph to estimate the global transformation and local deformable transformation, respectively. Specifically, we integrate multi-level pixel relation knowledge to guide the training of GAMorph. Additionally, we utilize an edge attention module that includes the geometric priors of the images, ensuring the deformation field focuses more on the vascular regions of clinical interest. Experiments on two widely-used datasets, FIRE and FLoRI21, show that our proposed HybridRetina significantly outperforms some state-of-the-art methods. The code is available at https://github.com/lyp-deeplearning/awesome-retinal-registration.

* Accepted at BIBM 2024

Via

Access Paper or Ask Questions

Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays

May 20, 2024

Zhichao Sun, Yuliang Gu, Yepeng Liu, Zerui Zhang, Zhou Zhao, Yongchao Xu

Figure 1 for Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays

Figure 2 for Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays

Figure 3 for Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays

Figure 4 for Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays

Abstract:Anomaly detection in chest X-rays is a critical task. Most methods mainly model the distribution of normal images, and then regard significant deviation from normal distribution as anomaly. Recently, CLIP-based methods, pre-trained on a large number of medical images, have shown impressive performance on zero/few-shot downstream tasks. In this paper, we aim to explore the potential of CLIP-based methods for anomaly detection in chest X-rays. Considering the discrepancy between the CLIP pre-training data and the task-specific data, we propose a position-guided prompt learning method. Specifically, inspired by the fact that experts diagnose chest X-rays by carefully examining distinct lung regions, we propose learnable position-guided text and image prompts to adapt the task data to the frozen pre-trained CLIP-based model. To enhance the model's discriminative capability, we propose a novel structure-preserving anomaly synthesis method within chest x-rays during the training process. Extensive experiments on three datasets demonstrate that our proposed method outperforms some state-of-the-art methods. The code of our implementation is available at https://github.com/sunzc-sunny/PPAD.

* MICCAI 2024 Early Accept

Via

Access Paper or Ask Questions

WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Mar 19, 2024

Haoyu Zhao, Yuliang Gu, Zhou Zhao, Bo Du, Yongchao Xu, Rui Yu

Figure 1 for WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Figure 2 for WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Figure 3 for WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Figure 4 for WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising

Abstract:In clinical examinations and diagnoses, low-dose computed tomography (LDCT) is crucial for minimizing health risks compared with normal-dose computed tomography (NDCT). However, reducing the radiation dose compromises the signal-to-noise ratio, leading to degraded quality of CT images. To address this, we analyze LDCT denoising task based on experimental results from the frequency perspective, and then introduce a novel self-supervised CT image denoising method called WIA-LD2ND, only using NDCT data. The proposed WIA-LD2ND comprises two modules: Wavelet-based Image Alignment (WIA) and Frequency-Aware Multi-scale Loss (FAM). First, WIA is introduced to align NDCT with LDCT by mainly adding noise to the high-frequency components, which is the main difference between LDCT and NDCT. Second, to better capture high-frequency components and detailed information, Frequency-Aware Multi-scale Loss (FAM) is proposed by effectively utilizing multi-scale feature space. Extensive experiments on two public LDCT denoising datasets demonstrate that our WIA-LD2ND, only uses NDCT, outperforms existing several state-of-the-art weakly-supervised and self-supervised methods.

* 12 pages, 5 figures

Via

Access Paper or Ask Questions

Proto-MPC: An Encoder-Prototype-Decoder Approach for Quadrotor Control in Challenging Winds

Jan 27, 2024

Yuliang Gu, Sheng Cheng, Naira Hovakimyan

Figure 1 for Proto-MPC: An Encoder-Prototype-Decoder Approach for Quadrotor Control in Challenging Winds

Figure 2 for Proto-MPC: An Encoder-Prototype-Decoder Approach for Quadrotor Control in Challenging Winds

Figure 3 for Proto-MPC: An Encoder-Prototype-Decoder Approach for Quadrotor Control in Challenging Winds

Figure 4 for Proto-MPC: An Encoder-Prototype-Decoder Approach for Quadrotor Control in Challenging Winds

Abstract:Quadrotors are increasingly used in the evolving field of aerial robotics for their agility and mechanical simplicity. However, inherent uncertainties, such as aerodynamic effects coupled with quadrotors' operation in dynamically changing environments, pose significant challenges for traditional, nominal model-based control designs. We propose a multi-task meta-learning method called Encoder-Prototype-Decoder (EPD), which has the advantage of effectively balancing shared and distinctive representations across diverse training tasks. Subsequently, we integrate the EPD model into a model predictive control problem (Proto-MPC) to enhance the quadrotor's ability to adapt and operate across a spectrum of dynamically changing tasks with an efficient online implementation. We validate the proposed method in simulations, which demonstrates Proto-MPC's robust performance in trajectory tracking of a quadrotor being subject to static and spatially varying side winds.

Via

Access Paper or Ask Questions

Dual Structure-Preserving Image Filterings for Semi-supervised Medical Image Segmentation

Dec 12, 2023

Yuliang Gu, Zhichao Sun, Xin Xiao, Yuda Zou, Zelong Liu, Yongchao Xu

Abstract:Semi-supervised image segmentation has attracted great attention recently. The key is how to leverage unlabeled images in the training process. Most methods maintain consistent predictions of the unlabeled images under variations (e.g., adding noise/perturbations, or creating alternative versions) in the image and/or model level. In most image-level variation, medical images often have prior structure information, which has not been well explored. In this paper, we propose novel dual structure-preserving image filterings (DSPIF) as the image-level variations for semi-supervised medical image segmentation. Motivated by connected filtering that simplifies image via filtering in structure-aware tree-based image representation, we resort to the dual contrast invariant Max-tree and Min-tree representation. Specifically, we propose a novel connected filtering that removes topologically equivalent nodes (i.e. connected components) having no siblings in the Max/Min-tree. This results in two filtered images preserving topologically critical structure. Applying such dual structure-preserving image filterings in mutual supervision is beneficial for semi-supervised medical image segmentation. Extensive experimental results on three benchmark datasets demonstrate that the proposed method significantly/consistently outperforms some state-of-the-art methods. The source codes will be publicly available.

Via

Access Paper or Ask Questions

Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

Dec 05, 2023

Ayoosh Bansal, Yang Zhao, James Zhu, Sheng Cheng, Yuliang Gu, Hyung-Jin Yoon, Hunmin Kim, Naira Hovakimyan, Lui Sha

Figure 1 for Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

Figure 2 for Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

Figure 3 for Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

Figure 4 for Synergistic Perception and Control Simplex for Verifiable Safe Vertical Landing

Abstract:Perception, Planning, and Control form the essential components of autonomy in advanced air mobility. This work advances the holistic integration of these components to enhance the performance and robustness of the complete cyber-physical system. We adapt Perception Simplex, a system for verifiable collision avoidance amidst obstacle detection faults, to the vertical landing maneuver for autonomous air mobility vehicles. We improve upon this system by replacing static assumptions of control capabilities with dynamic confirmation, i.e., real-time confirmation of control limitations of the system, ensuring reliable fulfillment of safety maneuvers and overrides, without dependence on overly pessimistic assumptions. Parameters defining control system capabilities and limitations, e.g., maximum deceleration, are continuously tracked within the system and used to make safety-critical decisions. We apply these techniques to propose a verifiable collision avoidance solution for autonomous aerial mobility vehicles operating in cluttered and potentially unsafe environments.

* To appear in AIAA SciTech 2024

Via

Access Paper or Ask Questions