Ziyang Hong

Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video

Mar 16, 2023
Ziyang Hong, C. Patrick Yue

We present a novel real-time learning method that jointly perceives a 3D scene's geometric structure and semantic labels. Recent approaches to real-time 3D scene reconstruction mostly adopt a volumetric scheme, where a truncated signed distance function (TSDF) is directly regressed. However, these volumetric approaches tend to focus on the global coherence of their reconstructions, which leads to a lack of local geometric detail. To overcome this issue, we propose to leverage the latent geometric prior knowledge in 2D image features, through explicit depth prediction and anchored feature generation, to refine the occupancy learning in the TSDF volume. We further find that this cross-dimensional feature refinement methodology can also be adopted for the semantic segmentation task. Hence, we propose an end-to-end cross-dimensional refinement neural network (CDRNet) that extracts both a 3D mesh and 3D semantic labels in real time. Experimental results show that the proposed method achieves state-of-the-art 3D perception efficiency on multiple datasets, indicating its strong potential for industrial applications.
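
As I read the abstract, the cross-dimensional refinement boils down to anchoring 2D image features in 3D via a predicted depth map so that they can refine the volumetric TSDF features. Below is a minimal PyTorch-style sketch of that back-projection step only; the function name, tensor layouts, and scatter strategy are illustrative assumptions, not CDRNet's actual implementation.

```python
import torch

def backproject_features(feat2d, depth, K, voxel_origin, voxel_size, grid_dim):
    """Scatter 2D image features into a 3D voxel grid at depth-anchored
    locations (camera frame taken as world frame for simplicity).

    feat2d:       (C, H, W) image feature map
    depth:        (H, W) predicted depth in meters
    K:            (3, 3) camera intrinsics
    voxel_origin: (3,) world coordinate of voxel (0, 0, 0)
    """
    C, H, W = feat2d.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    # Lift every pixel to a 3D point using its predicted depth.
    z = depth
    x = (u - K[0, 2]) / K[0, 0] * z
    y = (v - K[1, 2]) / K[1, 1] * z
    pts = torch.stack([x, y, z], dim=-1).reshape(-1, 3)
    # Convert points to voxel indices and drop those outside the volume.
    idx = ((pts - voxel_origin) / voxel_size).long()
    valid = ((idx >= 0) & (idx < grid_dim)).all(dim=-1)
    idx = idx[valid]
    feats = feat2d.reshape(C, -1).t()[valid]
    vol = torch.zeros(grid_dim, grid_dim, grid_dim, C)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = feats  # anchored 2D features in 3D
    return vol
```

In a full pipeline, a volume like this would presumably be fused with the coarse TSDF features and fed to a 3D network that regresses refined occupancy and, analogously, refined semantic logits.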

CURL: Continuous, Ultra-compact Representation for LiDAR

May 12, 2022
Kaicheng Zhang, Ziyang Hong, Shida Xu, Sen Wang

Increasing the density of the 3D LiDAR point cloud is appealing for many applications in robotics. However, high-density LiDAR sensors are usually costly and still provide limited coverage per scan (e.g., 128 channels), while denser point cloud scans and maps mean larger volumes to store and longer times to transmit. Existing works focus on either improving point cloud density or compressing its size. This paper aims to design a novel 3D point cloud representation that can continuously increase point cloud density while reducing its storage and transmission size. The pipeline of the proposed Continuous, Ultra-compact Representation of LiDAR (CURL) includes four main steps: meshing, upsampling, encoding, and continuous reconstruction. It is capable of transforming a 3D LiDAR scan or map into a compact spherical-harmonics representation that can be used or transmitted at low latency to continuously reconstruct a much denser 3D point cloud. Extensive experiments on four public datasets, covering college gardens, city streets, and indoor rooms, demonstrate that much denser 3D point clouds can be accurately reconstructed using the proposed CURL representation while achieving up to 80% savings in storage space. We open-source the CURL code for the community.
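
To make the encoding and continuous-reconstruction steps concrete, here is a hedged NumPy/SciPy sketch that fits spherical harmonics to range as a function of direction and then evaluates the fit on an arbitrarily dense angular grid. It is a toy least-squares version assuming range is single-valued in direction; the actual CURL pipeline also performs meshing and upsampling, and encodes patch-wise.

```python
import numpy as np
from scipy.special import sph_harm  # sph_harm(m, n, azimuth, polar)

def sh_basis(az, pol, degree):
    """Design matrix of real spherical harmonics for directions (az, pol)."""
    cols = []
    for n in range(degree + 1):
        for m in range(-n, n + 1):
            y = sph_harm(abs(m), n, az, pol)
            cols.append(np.real(y) if m >= 0 else np.imag(y))
    return np.stack(cols, axis=-1)

def encode(az, pol, ranges, degree=8):
    """Encoding: least-squares fit, (degree + 1)**2 coefficients in total."""
    A = sh_basis(az, pol, degree)
    coeffs, *_ = np.linalg.lstsq(A, ranges, rcond=None)
    return coeffs

def reconstruct(coeffs, az, pol, degree=8):
    """Continuous reconstruction: query the range at any angular density."""
    return sh_basis(az, pol, degree) @ coeffs
```

The compactness is easy to see from the coefficient count: a degree-8 fit stores (8+1)^2 = 81 numbers, however many points are later reconstructed from it.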

* Robotics: Science and Systems (RSS), 2022  

Radar SLAM: A Robust SLAM System for All Weather Conditions

Apr 12, 2021
Ziyang Hong, Yvan Petillot, Andrew Wallace, Sen Wang

A Simultaneous Localization and Mapping (SLAM) system must be robust to support long-term mobile vehicle and robot applications. However, camera- and LiDAR-based SLAM systems can be fragile when facing challenging illumination or weather conditions that degrade their imagery and point cloud data. Radar, whose operating electromagnetic spectrum is less affected by environmental changes, is promising, although its distinct sensing geometry and noise characteristics pose open challenges when exploited for SLAM. This paper studies the use of a Frequency Modulated Continuous Wave radar for SLAM in large-scale outdoor environments. We propose a full radar SLAM system, including a novel radar motion tracking algorithm that leverages radar geometry for reliable feature tracking, compensates for motion distortion, and estimates pose by joint optimization. Its loop closure component is designed to be simple yet efficient for radar imagery by capturing and exploiting structural information of the surrounding environment. Extensive experiments on three public radar datasets, ranging from city streets and residential areas to countryside and highways, show that the proposed radar SLAM system is competitive in accuracy and reliability with state-of-the-art LiDAR, vision, and radar methods. The results show that our system is technically viable for achieving reliable SLAM in extreme weather conditions, e.g., heavy snow and dense fog, demonstrating the promising potential of radar for all-weather localization and mapping.
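
To illustrate one ingredient of the motion tracking, the sketch below de-skews a radar scan under a constant-velocity model: a mechanically scanning radar acquires azimuth beams sequentially, so each detection is corrected by the motion accumulated since the scan started. This mirrors the paper's motion-distortion compensation in spirit only; the paper solves it within a joint optimization, and all names here are hypothetical.

```python
import numpy as np

def deskew_scan(points, timestamps, v, omega):
    """De-skew radar detections with a constant-velocity motion model.

    points:     (N, 2) Cartesian detections, each in the sensor frame
                at the instant it was measured
    timestamps: (N,) seconds since the start of the scan
    v:          (2,) linear velocity in the scan-start frame
    omega:      yaw rate in rad/s
    """
    out = np.empty_like(points)
    for i, (p, t) in enumerate(zip(points, timestamps)):
        theta = omega * t                      # heading change accumulated
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        out[i] = R @ p + v * t                 # express in scan-start frame
    return out
```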

* Under review 

Efficient Training Convolutional Neural Networks on Edge Devices with Gradient-pruned Sign-symmetric Feedback Alignment

Mar 04, 2021
Ziyang Hong, C. Patrick Yue

With the proliferation of mobile devices, distributed learning, which enables model training on decentralized data, has attracted wide research interest. However, the limited training capability of edge devices significantly constrains the energy efficiency of distributed learning in practice. This paper describes a novel approach to training DNNs that exploits the redundancy in gradients and the tolerance of conventional backpropagation to asymmetric feedback weights. We demonstrate that, with negligible loss in classification accuracy, the proposed approach outperforms prior art by 5x in energy efficiency.
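
A minimal NumPy sketch of the two ideas in the title, as suggested by the abstract: the backward pass uses only the signs of the forward weights (sign-symmetric feedback), and small-magnitude weight gradients are pruned so their updates can be skipped. The layer shape conventions and the top-k pruning rule are my assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def backward_sign_symmetric(grad_out, W, x, keep_ratio=0.1):
    """One linear layer's backward pass, for the forward pass y = x @ W.

    grad_out: (B, M) upstream gradient; W: (N, M) forward weights;
    x: (B, N) layer input. Returns the pruned weight gradient and the
    gradient w.r.t. the input.
    """
    # Sign-symmetric feedback: propagate through sign(W) instead of W,
    # so the feedback path needs only one bit per weight.
    grad_in = grad_out @ np.sign(W).T
    dW = x.T @ grad_out
    # Gradient pruning: zero all but the largest-magnitude entries so
    # their weight updates can be skipped entirely.
    k = max(1, int(keep_ratio * dW.size))
    thresh = np.partition(np.abs(dW).ravel(), -k)[-k]
    dW[np.abs(dW) < thresh] = 0.0
    return dW, grad_in
```

Presumably the energy savings come from the one-bit feedback weights and the skipped updates reducing arithmetic and memory traffic on the edge device.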

Multi-Task Reinforcement Learning based Mobile Manipulation Control for Dynamic Object Tracking and Grasping

Jun 07, 2020
Cong Wang, Qifeng Zhang, Qiyan Tian, Shuo Li, Xiaohui Wang, David Lane, Yvan Petillot, Ziyang Hong, Sen Wang

Agile control of a mobile manipulator is challenging because of the high complexity arising from the coupling of the robotic system with its unstructured working environment. Tracking and grasping a dynamic object following a random trajectory is even harder. In this paper, a multi-task reinforcement learning-based mobile manipulation control framework is proposed to achieve general dynamic object tracking and grasping. Several basic types of dynamic trajectories are chosen as the task training set. To improve policy generalization in practice, random noise and dynamics randomization are introduced during the training process. Extensive experiments show that the trained policy can adapt to unseen random dynamic trajectories, achieving a tracking error of about 0.1 m and a 75% grasping success rate on dynamic objects. The trained policy can also be successfully deployed on a real mobile manipulator.
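
The training-time randomization mentioned above can be pictured as a thin environment wrapper that re-samples dynamics parameters each episode and perturbs observations. The sketch below is a generic Gym-style illustration; set_mass_scale, the parameter ranges, and the noise level are hypothetical, not the paper's setup.

```python
import numpy as np

class RandomizedEnv:
    """Gym-style wrapper: per-episode dynamics randomization + obs noise."""

    def __init__(self, env, mass_range=(0.8, 1.2), obs_noise_std=0.01):
        self.env = env
        self.mass_range = mass_range
        self.obs_noise_std = obs_noise_std

    def reset(self):
        # Re-sample physical parameters every episode so the policy
        # cannot overfit a single dynamics model.
        self.env.set_mass_scale(np.random.uniform(*self.mass_range))  # hypothetical hook
        return self._noisy(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._noisy(obs), reward, done, info

    def _noisy(self, obs):
        # Random observation noise, the second ingredient in the abstract.
        return obs + np.random.normal(0.0, self.obs_noise_std, size=obs.shape)
```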

* 6 pages, 7 figures, submitted to IROS2020 

RadarSLAM: Radar based Large-Scale SLAM in All Weathers

May 05, 2020
Ziyang Hong, Yvan Petillot, Sen Wang

Numerous Simultaneous Localization and Mapping (SLAM) algorithms have been presented in the last decade using different sensor modalities. However, robust SLAM in extreme weather conditions is still an open research problem. In this paper, RadarSLAM, a full radar-based graph SLAM system, is proposed for reliable localization and mapping in large-scale environments. It is composed of pose tracking, local mapping, loop closure detection, and pose graph optimization, enhanced by novel feature matching and probabilistic point cloud generation on radar images. Extensive experiments are conducted on a public radar dataset and several self-collected radar sequences, demonstrating state-of-the-art reliability and localization accuracy in various adverse weather conditions, such as dark night, dense fog, and heavy snowfall.
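
As a rough illustration of the probabilistic point cloud generation step, the sketch below converts a polar radar power image into 2D points, keeping per-beam power peaks and attaching a confidence weight to each. The thresholding rule and parameters are illustrative guesses, not the paper's method.

```python
import numpy as np

def radar_to_points(polar, range_res=0.05, rel_thresh=0.6):
    """polar: (A, R) power image; rows are azimuth beams over [0, 2*pi)."""
    A, R = polar.shape
    pts, conf = [], []
    for a in range(A):
        beam = polar[a]
        peak = beam.max()
        if peak <= 0:
            continue
        # Bins close in power to the beam peak are treated as likely
        # returns; their relative power doubles as a confidence weight.
        for r in np.flatnonzero(beam >= rel_thresh * peak):
            rho = r * range_res
            theta = 2 * np.pi * a / A
            pts.append([rho * np.cos(theta), rho * np.sin(theta)])
            conf.append(beam[r] / peak)
    return np.asarray(pts), np.asarray(conf)
```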
