Rong Xiong

CNS: Correspondence Encoded Neural Image Servo Policy

Sep 16, 2023
Anzhe Chen, Hongxiang Yu, Yue Wang, Rong Xiong

Image servoing is an indispensable technique in robotic applications that helps to achieve high-precision positioning. The intermediate representation of an image servo policy is important for sensor input abstraction and policy output guidance. Classical approaches achieve high precision but require clean keypoint correspondence, and suffer from a limited convergence basin or weak robustness to feature errors. Recent learning-based methods achieve moderate precision and a large convergence basin on specific scenes but face issues when generalizing to novel environments. In this paper, we encode keypoints and correspondence into a graph and use a graph neural network as the controller architecture. This design combines both advantages: a generalizable intermediate representation from keypoint correspondence and the strong modeling ability of neural networks. Further techniques, including realistic data generation, feature clustering and distance decoupling, are proposed to improve efficiency, precision and generalization. Experiments in simulation and the real world verify the effectiveness of our method in speed (up to 40 fps including the observer), precision (<0.3° and sub-millimeter accuracy) and generalization (sim-to-real without fine-tuning). Project homepage (full paper with supplementary text, video and code): https://hhcaz.github.io/CNS-home
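The abstract's core idea is to represent matched keypoints as graph nodes and let a graph neural network map them to a control command. The NumPy sketch below is a toy, untrained illustration of that pipeline, not the paper's CNS architecture: the node features, the single mean-aggregation message-passing round, and all weight matrices are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matched keypoints: current and desired 2D locations (N correspondences).
N = 5
desired = rng.uniform(-1, 1, (N, 2))
current = desired + rng.normal(0, 0.1, (N, 2))

# Node features: per-correspondence feature error, as in IBVS.
x = current - desired                                # (N, 2)

# Fully connected correspondence graph (no self-loops) as an adjacency matrix.
adj = np.ones((N, N)) - np.eye(N)

# One round of mean-aggregation message passing with toy weights.
W_self = rng.normal(0, 0.1, (2, 8))
W_msg = rng.normal(0, 0.1, (2, 8))
msgs = (adj @ x) / adj.sum(axis=1, keepdims=True)    # mean over neighbours
h = np.tanh(x @ W_self + msgs @ W_msg)               # (N, 8) node embeddings

# Global mean-pool to a graph embedding, then map to a 6-DoF velocity twist.
W_out = rng.normal(0, 0.1, (8, 6))
twist = h.mean(axis=0) @ W_out
print(twist.shape)  # (6,)
```

Because the controller only sees errors and graph structure, the same network can in principle be reused across scenes with different keypoint sets, which is the generalization argument the abstract makes.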

3D Model-free Visual Localization System from Essential Matrix under Local Planar Motion

Sep 04, 2023
Yanmei Jiao, Binxin Zhang, Peng Jiang, Chaoqun Wang, Rong Xiong, Yue Wang

Visual localization plays a critical role in the functionality of low-cost autonomous mobile robots. Current state-of-the-art approaches for achieving accurate visual localization are 3D scene-specific, requiring additional computational and storage resources to construct a 3D scene model when facing a new environment. An alternative approach of directly using a database of 2D images for visual localization offers more flexibility. However, such methods currently suffer from limited localization accuracy. In this paper, we propose an accurate and robust multiple-checking-based 3D model-free visual localization system to address the aforementioned issues. To ensure high accuracy, our focus is on estimating the pose of a query image relative to the retrieved database images using 2D-2D feature matches. Theoretically, by incorporating the local planar motion constraint into both the estimation of the essential matrix and the triangulation stage, we reduce the minimum number of feature matches required for absolute pose estimation, thereby enhancing the robustness of outlier rejection. Additionally, we introduce a multiple-checking mechanism to ensure the correctness of the solution throughout the solving process. For validation, qualitative and quantitative experiments are performed on both simulation and two real-world datasets, and the results demonstrate a significant enhancement in both accuracy and robustness afforded by the proposed 3D model-free visual localization system.
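The planar-motion constraint exploited above can be made concrete: for a ground robot, the relative pose between two camera views is a rotation about the vertical axis plus a translation in the ground plane, so the essential matrix has only two effective unknowns rather than five, which is why fewer feature matches suffice. The sketch below (an illustration of this standard parameterization, not the paper's minimal solver) builds such an essential matrix and verifies the epipolar constraint:

```python
import numpy as np

def essential_planar(theta, tx, tz):
    """Essential matrix for planar motion: rotation about the camera
    y-axis by theta, translation (tx, 0, tz) in the ground plane."""
    R = np.array([[ np.cos(theta), 0, np.sin(theta)],
                  [ 0,             1, 0            ],
                  [-np.sin(theta), 0, np.cos(theta)]])
    t = np.array([tx, 0.0, tz])
    t_x = np.array([[    0, -t[2],  t[1]],
                    [ t[2],     0, -t[0]],
                    [-t[1],  t[0],     0]])   # skew matrix: t_x @ v == cross(t, v)
    return t_x @ R, R, t

E, R, t = essential_planar(theta=0.3, tx=0.4, tz=1.0)

# Check the epipolar constraint on random scene points:
# if P2 = R @ P1 + t, then P2^T E P1 = 0.
rng = np.random.default_rng(1)
for _ in range(5):
    P1 = rng.uniform(-1, 1, 3) + np.array([0, 0, 3.0])  # point in cam-1 frame
    P2 = R @ P1 + t
    assert abs(P2 @ E @ P1) < 1e-9
print("epipolar constraint holds")
```

Note the zero pattern that planar motion imposes on E (e.g. the central entry vanishes); it is this extra structure that outlier-rejection schemes can check against.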

Sparse Waypoint Validity Checking for Self-Entanglement-Free Tethered Path Planning

Aug 30, 2023
Tong Yang, Jiangpin Liu, Yue Wang, Rong Xiong

A novel mechanism to derive self-entanglement-free (SEF) paths for tethered differential-driven robots is proposed in this work. The problem is tailored to the deployment of tethered differential-driven robots in situations where an omni-directional tether retractor is not available. This is frequently encountered when it is impractical to concurrently equip an omni-directional tether retracting mechanism with other geometrically intricate devices, such as a manipulator, which is notably relevant in applications like disaster recovery and spatial exploration. Without specific attention to the spatial relation between the shape of the tether and the pose of the mobile unit, the issue of self-entanglement arises when the robot moves, resulting in unsafe robot movements and the risk of damaging the tether. In this paper, the SEF constraint is first formulated as the boundedness of a relative angle function which characterises the angular difference between the tether stretching direction and the robot's heading direction. Then, a constrained search-based path planning algorithm is proposed which produces a path that is sub-optimal whilst ensuring the avoidance of tether self-entanglement. Finally, the algorithmic efficiency of the proposed path planner is further enhanced by proving the conditioned sparsity of the primitive path validity checking module. The effectiveness of the proposed algorithm is assessed through case studies, comparing its performance against untethered differential-driven planners in challenging planning scenarios. A comparative analysis is further conducted between the normal node expansion module and the improved node expansion module which incorporates sparse waypoint validity checking. Real-world tests are also conducted to validate the algorithm's performance. An open-source implementation has also been made available for the benefit of the robotics community.
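The boundedness formulation above can be sketched numerically. The toy below computes a relative angle function along a 2D path, assuming a taut straight-line tether to a fixed anchor; the anchor, path, and the particular boundedness threshold are illustrative assumptions, not the paper's exact SEF condition.

```python
import numpy as np

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def relative_angles(path, anchor):
    """Angle between the tether stretching direction (robot -> anchor,
    for a taut straight tether) and the robot's heading, per waypoint."""
    angles = []
    for i in range(len(path) - 1):
        p = np.asarray(path[i], dtype=float)
        d = np.asarray(path[i + 1], dtype=float) - p
        heading = np.arctan2(d[1], d[0])
        tv = np.asarray(anchor, dtype=float) - p
        tether = np.arctan2(tv[1], tv[0])
        angles.append(wrap(tether - heading))
    return np.array(angles)

# A gentle arc away from the anchor keeps the tether trailing behind the
# robot, so the (unwrapped) relative angle stays within a bounded band.
anchor = (0.0, 0.0)
path = [(1.0, 0.0), (2.0, 0.2), (3.0, 0.6), (4.0, 1.2)]
rel = relative_angles(path, anchor)
bounded = np.ptp(np.unwrap(rel)) < np.pi   # toy boundedness check
print(bounded)  # True
```

A path that spirals around the anchor would accumulate relative-angle winding and fail this kind of check, which is the intuition behind using boundedness as the entanglement criterion.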

* This is a generalised version of the authors' ICRA23 conference paper 
3D Model-free Visual Localization System from Essential Matrix under Local Planar Motion

Aug 18, 2023
Yanmei Jiao, Binxin Zhang, Peng Jiang, Rong Xiong, Yue Wang

Visual localization plays a critical role in the functionality of low-cost autonomous mobile robots. Current state-of-the-art approaches to accurate visual localization are 3D scene-specific, requiring additional computational and storage resources to construct a 3D scene model when facing a new environment. An alternative approach of directly using a database of 2D images for visual localization offers more flexibility. However, such methods currently suffer from limited localization accuracy. In this paper, we propose a robust and accurate multiple-checking-based 3D model-free visual localization system that addresses the aforementioned issues. The core idea is to incorporate the local planar motion characteristic of general ground-moving robots into both the essential matrix estimation and triangulation stages to obtain two minimal solutions. By embedding the proposed minimal solutions into the multiple-checking scheme, the proposed 3D model-free visual localization framework demonstrates high accuracy and robustness in both simulation and real-world experiments.

Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction

Aug 17, 2023
Yuhao Yang, Jun Wu, Guangjian Zhang, Rong Xiong

Traditional geometric-registration-based estimation methods exploit the CAD model only implicitly, which makes them dependent on observation quality and vulnerable to occlusion. To address this problem, the paper proposes a bidirectional correspondence prediction network with a point-wise attention-aware mechanism. This network not only requires the model points to predict the correspondence but also explicitly models the geometric similarities between observations and the model prior. Our key insight is that the correlations between each model point and scene point provide essential information for learning point-pair matches. To further tackle the correlation noise brought by feature distribution divergence, we design a simple but effective pseudo-siamese network to improve feature homogeneity. Experimental results on the public LineMOD, YCB-Video, and Occ-LineMOD datasets show that the proposed method outperforms other state-of-the-art methods under the same evaluation criteria. Its robustness in estimating poses is greatly improved, especially in environments with severe occlusion.
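The "correlations between each model point and scene point" mentioned above amount to a point-wise attention matrix between the two point sets. The NumPy toy below illustrates that idea with random features and soft bidirectional correspondences; the feature dimensions, the dot-product similarity, and the scaling are illustrative assumptions, not the paper's trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-point features for M CAD-model points and S observed scene points.
M, S, D = 6, 4, 16
model_feat = rng.normal(size=(M, D))
scene_feat = rng.normal(size=(S, D))

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Point-wise correlation between every model point and every scene point.
corr = model_feat @ scene_feat.T / np.sqrt(D)      # (M, S)

# Bidirectional soft correspondences: model->scene and scene->model.
m2s = softmax(corr, axis=1)    # each model point distributes over scene points
s2m = softmax(corr.T, axis=1)  # each scene point distributes over model points

# Predicted scene location for each model point under the soft assignment
# (using toy 3D coordinates for the scene points).
scene_xyz = rng.normal(size=(S, 3))
pred_xyz = m2s @ scene_xyz                         # (M, 3)
print(pred_xyz.shape)  # (6, 3)
```

Predicting correspondences in both directions gives two consistency signals from the same correlation matrix, which is the intuition behind the bidirectional design.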

Leveraging BEV Representation for 360-degree Visual Place Recognition

May 23, 2023
Xuecheng Xu, Yanmei Jiao, Sha Lu, Xiaqing Ding, Rong Xiong, Yue Wang

This paper investigates the advantages of using a Bird's Eye View (BEV) representation in 360-degree visual place recognition (VPR). We propose a novel network architecture that utilizes the BEV representation in feature extraction, feature aggregation, and vision-LiDAR fusion, which bridges visual cues and spatial awareness. Our method extracts image features using standard convolutional networks and combines the features according to pre-defined 3D grid spatial points. To alleviate the mechanical and time misalignments between cameras, we further introduce deformable attention to learn the compensation. Upon the BEV feature representation, we then employ the polar transform and the discrete Fourier transform for aggregation, which is shown to be rotation-invariant. In addition, image and point cloud cues can easily be expressed in the same coordinate frame, which benefits sensor fusion for place recognition. The proposed BEV-based method is evaluated in ablation and comparative studies on two datasets, covering both on-the-road and off-the-road scenarios. The experimental results verify the hypothesis that BEV can benefit VPR through its superior performance compared to baseline methods. To the best of our knowledge, this is the first attempt to employ a BEV representation in this task.
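The rotation-invariance claim for the polar transform plus discrete Fourier transform can be checked in a few lines: a yaw rotation of the BEV map becomes a circular shift along the angular axis of the polar grid, and the DFT magnitude discards circular shifts (the shift theorem). The sketch below assumes the feature map is already resampled on a polar grid; the grid sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# A BEV feature map already resampled on a polar grid:
# axis 0 = angular bins, axis 1 = radial bins.
polar = rng.normal(size=(64, 16))

def descriptor(polar_feat):
    """Magnitude of the DFT along the angular axis. A yaw rotation of the
    BEV map circularly shifts this axis, which the magnitude discards."""
    return np.abs(np.fft.fft(polar_feat, axis=0))

d0 = descriptor(polar)
d_rot = descriptor(np.roll(polar, 11, axis=0))  # simulate a yaw rotation
print(np.allclose(d0, d_rot))  # True: descriptor is rotation-invariant
```

This is why two visits to the same place with different headings still produce matching descriptors.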

An Efficient Multi-solution Solver for the Inverse Kinematics of 3-Section Constant-Curvature Robots

May 02, 2023
Ke Qiu, Jingyu Zhang, Danying Sun, Rong Xiong, Haojian Lu, Yue Wang

Piecewise constant curvature is a popular kinematics framework for continuum robots. Computing the model parameters from the desired end pose, known as the inverse kinematics problem, is fundamental in manipulation, tracking and planning tasks. In this paper, we propose an efficient multi-solution solver for the inverse kinematics problem of 3-section constant-curvature robots by bridging theoretical reduction and numerical correction. We derive analytical conditions to simplify the original problem into a one-dimensional problem, and formalise the equivalence of the two problems. In addition, we introduce an approximation with bounded error so that the one dimension becomes traversable while the remaining parameters are analytically solvable. With these theoretical results, global search and numerical correction are employed to implement the solver. The experiments show that our solver is more efficient and achieves a higher success rate than numerical methods when one solution is required, and demonstrate its ability to obtain multiple solutions, enabling optimal path planning in a space with obstacles.
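For context, the forward map that such a solver inverts is the standard piecewise-constant-curvature (PCC) kinematics: each section is an arc described by a curvature, a bending-plane angle, and an arc length. The sketch below implements the tip position of one section under this standard model (it is background for the abstract, not the paper's solver, and the parameter convention is one common choice among several):

```python
import numpy as np

def pcc_tip(kappa, phi, length):
    """Tip position of one constant-curvature section: bending-plane angle
    phi about the base z-axis, curvature kappa, arc length `length`."""
    if abs(kappa) < 1e-9:                     # straight-segment limit
        local = np.array([0.0, 0.0, length])
    else:
        theta = kappa * length                # total bending angle
        r = 1.0 / kappa                       # arc radius
        local = np.array([r * (1 - np.cos(theta)), 0.0, r * np.sin(theta)])
    c, s = np.cos(phi), np.sin(phi)
    Rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    return Rz @ local

# Sanity checks: near-zero curvature reduces to a straight segment, and
# bending by theta = pi/2 puts the tip at (r, 0, r) in the bending plane.
print(pcc_tip(1e-12, 0.7, 1.0))               # [0, 0, 1]
tip = pcc_tip(np.pi / 2, 0.0, 1.0)            # theta = pi/2, r = 2/pi
```

Chaining three such sections gives a 9-parameter forward map; the paper's contribution is reducing its 6-DoF inversion to a traversable one-dimensional search.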

* Robotics: Science and Systems 2023 
Learning adaptive manipulation of objects with revolute joint: A case study on varied cabinet doors opening

Apr 28, 2023
Hongxiang Yu, Dashun Guo, Zhongxiang Zhou, Yue Wang, Rong Xiong

This paper introduces a learning-based framework for adaptive robotic manipulation of objects with a revolute joint in unstructured environments. We focus our discussion on various cabinet door opening tasks. To improve the performance of deep reinforcement learning in this setting, we analytically derive an efficient sampling scheme that exploits the constraints of the objects. To open various kinds of doors, we add encoded environment parameters, which define the various environments, to the input of our policy. To transfer the policy to the real world, we train an adaptation module in simulation and fine-tune it to reduce the impact of policy-unaware environment parameters. We design a series of experiments to validate the efficacy of our framework, and additionally evaluate the model's real-world performance against a traditional door opening method.
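The constraint-aware sampling mentioned above exploits the fact that a revolute joint leaves only one degree of freedom: the handle of a hinged door moves on a circle about the hinge axis. The toy below (a 2D top-down illustration with made-up hinge and radius values, not the paper's sampler) shows sampling restricted to that manifold:

```python
import numpy as np

def handle_position_on_arc(hinge_xy, radius, angle):
    """Top-down position of a door handle constrained to the circle traced
    about a vertical hinge axis: the revolute joint leaves one DoF."""
    hx, hy = hinge_xy
    return np.array([hx + radius * np.cos(angle),
                     hy + radius * np.sin(angle)])

# Sampling door-opening angles instead of free 6-DoF poses confines
# exploration to the 1-D manifold the articulated object allows.
hinge, radius = (0.0, 0.0), 0.8
samples = [handle_position_on_arc(hinge, radius, a)
           for a in np.linspace(0.0, np.pi / 2, 10)]

# Every sample respects the joint constraint: constant distance to hinge.
dists = [np.linalg.norm(p - np.array(hinge)) for p in samples]
print(np.allclose(dists, radius))  # True
```

Sampling on the constraint manifold rather than in free space is what makes exploration efficient for this family of tasks.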

Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow

Apr 25, 2023
Dongkun Zhang, Jintao Xue, Yuxiang Cui, Yunkai Wang, Eryun Liu, Wei Jing, Junbo Chen, Rong Xiong, Yue Wang

Acquiring driving policies that transfer to unseen environments is challenging when driving in dense traffic flows. The design of the traffic flow is essential, and previous studies fail to balance interactivity and safety-criticality. To tackle this problem, we propose a socially adversarial traffic flow. We formulate traffic flow as a Contextual Partially-Observable Stochastic Game and assign Social Value Orientation (SVO) as the context. We then adopt a two-stage framework. In Stage 1, each agent in our socially-aware traffic flow is driven by a hierarchical policy in which the upper-level policy communicates the genuine SVOs of all agents, which the lower-level policy takes as input. In Stage 2, each agent in the socially adversarial traffic flow is driven by the hierarchical policy in which the upper level communicates mistaken SVOs, taken by the lower-level policy trained in Stage 1. The driving policy is adversarially trained through a zero-sum game formulation against the upper-level policies, resulting in a policy with enhanced zero-shot transfer capability to unseen traffic flows. Comprehensive cross-validation experiments verify the superior zero-shot transfer performance of our method.
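For readers unfamiliar with SVO: in the socially-aware driving literature it is commonly an angle that trades off an agent's own reward against the rewards of surrounding agents. The snippet below shows that common formulation as background; the exact reward shaping used in this paper may differ.

```python
import numpy as np

def svo_reward(r_self, r_others, svo):
    """Common SVO reward shaping: the angle `svo` trades off an agent's own
    reward against the mean reward of surrounding agents."""
    return np.cos(svo) * r_self + np.sin(svo) * np.mean(r_others)

r_self, r_others = 1.0, [0.5, -0.2, 0.3]
egoistic = svo_reward(r_self, r_others, svo=0.0)         # cares only about self
prosocial = svo_reward(r_self, r_others, svo=np.pi / 4)  # balances self/others
print(egoistic, prosocial)
```

Communicating a mistaken SVO, as in Stage 2 above, effectively misleads the lower-level policy about how cooperative its neighbours are, which is what makes the traffic flow adversarial.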

A Hyper-network Based End-to-end Visual Servoing with Arbitrary Desired Poses

Apr 18, 2023
Hongxiang Yu, Anzhe Chen, Kechun Xu, Zhongxiang Zhou, Wei Jing, Yue Wang, Rong Xiong

Recently, several works have achieved end-to-end visual servoing (VS) for robotic manipulation by replacing the traditional controller with differentiable neural networks, but they lose the ability to servo arbitrary desired poses. This letter proposes a differentiable architecture for arbitrary-pose servoing: a hyper-network based neural controller (HPN-NC). HPN-NC consists of a hyper-net and a low-level controller, where the hyper-net learns to generate the parameters of the low-level controller and the controller uses the 2D keypoint error for control, like traditional image-based visual servoing (IBVS). HPN-NC can complete six-degree-of-freedom visual servoing with large initial offsets. Taking advantage of the fully differentiable nature of HPN-NC, we provide a three-stage training procedure to servo real-world objects. With self-supervised end-to-end training, the performance of the integrated model can be further improved in unseen scenes and the amount of manual annotation can be significantly reduced.
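The hyper-network idea above is mechanically simple: one network maps a goal encoding to all the weights of a small controller, so a single model can servo arbitrary desired poses. The NumPy sketch below illustrates that wiring with toy, untrained weights; the layer sizes and the goal encoding are assumptions for illustration, not the HPN-NC architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

IN, HID, OUT = 8, 16, 6     # keypoint-error dim -> hidden -> 6-DoF velocity
COND = 12                   # encoding of the desired pose / goal keypoints

# Hyper-net: maps the goal encoding to ALL parameters of the low-level
# controller (here a one-hidden-layer MLP).
n_params = IN * HID + HID + HID * OUT + OUT
W_hyper = rng.normal(0, 0.05, (COND, n_params))

def low_level_controller(goal_code, kp_error):
    """Run the controller whose weights are generated by the hyper-net."""
    params = goal_code @ W_hyper
    i = 0
    W1 = params[i:i + IN * HID].reshape(IN, HID);   i += IN * HID
    b1 = params[i:i + HID];                         i += HID
    W2 = params[i:i + HID * OUT].reshape(HID, OUT); i += HID * OUT
    b2 = params[i:i + OUT]
    h = np.tanh(kp_error @ W1 + b1)
    return h @ W2 + b2          # commanded 6-DoF velocity

goal_code = rng.normal(size=COND)
kp_error = rng.normal(size=IN)   # stacked 2D keypoint errors, as in IBVS
twist = low_level_controller(goal_code, kp_error)
print(twist.shape)  # (6,)
```

Because the whole pipeline is a composition of differentiable operations, gradients can flow from the commanded velocity back into the hyper-net, which is what enables the self-supervised end-to-end training stage.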
