Hanjiang Hu

RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions

Oct 23, 2023
Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Wei Tsang Ooi

Depth estimation from monocular images is pivotal for real-world visual perception systems. While current learning-based depth estimation models train and test on meticulously curated data, they often overlook out-of-distribution (OoD) situations. Yet, in practical settings -- especially safety-critical ones like autonomous driving -- common corruptions can arise. Addressing this oversight, we introduce a comprehensive robustness test suite, RoboDepth, encompassing 18 corruptions spanning three categories: i) weather and lighting conditions; ii) sensor failures and movement; and iii) data processing anomalies. We subsequently benchmark 42 depth estimation models across indoor and outdoor scenes to assess their resilience to these corruptions. Our findings underscore that, in the absence of a dedicated robustness evaluation framework, many leading depth estimation models may be susceptible to typical corruptions. We delve into design considerations for crafting more robust depth estimation models, touching upon pre-training, augmentation, modality, model capacity, and learning paradigms. We anticipate our benchmark will establish a foundational platform for advancing robust OoD depth estimation.

* NeurIPS 2023; 45 pages, 25 figures, 13 tables; Code at https://github.com/ldkong1205/RoboDepth 
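Below is a minimal sketch of the kind of corruption-robustness check such a benchmark performs: corrupt an input image, run the same depth model on the clean and corrupted versions, and compare the standard absolute relative error. The `depth_model` callable and the Gaussian-noise corruption are illustrative assumptions, not the RoboDepth toolkit itself.

```python
# Minimal sketch of a corruption-robustness check for a depth model.
# `depth_model` and the clean image / ground-truth arrays are assumptions;
# Gaussian noise stands in for one of the 18 RoboDepth corruption types.
import numpy as np

def abs_rel(pred, gt, eps=1e-6):
    """Absolute relative error, the standard monocular-depth metric."""
    valid = gt > eps
    return np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid])

def gaussian_noise(img, severity=3):
    """Toy corruption: additive Gaussian noise with severity-scaled sigma."""
    sigma = [0.04, 0.06, 0.08, 0.10, 0.12][severity - 1]
    noisy = img.astype(np.float32) / 255.0 + np.random.normal(0, sigma, img.shape)
    return (np.clip(noisy, 0, 1) * 255).astype(np.uint8)

def robustness_gap(depth_model, image, gt_depth, severity=3):
    """Compare error on the clean image vs. its corrupted counterpart."""
    clean_err = abs_rel(depth_model(image), gt_depth)
    corrupt_err = abs_rel(depth_model(gaussian_noise(image, severity)), gt_depth)
    return clean_err, corrupt_err
```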

Influence of Camera-LiDAR Configuration on 3D Object Detection for Autonomous Driving

Oct 08, 2023
Ye Li, Hanjiang Hu, Zuxin Liu, Ding Zhao

Cameras and LiDARs are both important sensors for autonomous driving and play critical roles in 3D object detection, with camera-LiDAR fusion being a prevalent solution for robust and accurate perception. In contrast to the vast majority of existing work, which focuses on improving 3D object detection through cross-modal schemes, deep learning algorithms, and training tricks, we devote attention to the impact of sensor configurations on the performance of learning-based methods. To this end, we propose a unified information-theoretic surrogate metric for camera and LiDAR evaluation based on a proposed sensor perception model. We also design an accelerated, high-quality framework for data acquisition, model training, and performance evaluation built on the CARLA simulator. To examine the correlation between detection performance and our surrogate metric, we conduct experiments using several camera-LiDAR placements and parameters inspired by self-driving companies and research institutions. Extensive experimental results of representative algorithms on the nuScenes dataset validate the effectiveness of our surrogate metric, demonstrating that sensor configurations significantly impact point-cloud-image fusion based detection models, contributing up to a 30% discrepancy in average precision.
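As a rough illustration of an information-style placement score (not the paper's exact surrogate metric), the sketch below discretizes the region around the ego vehicle into a grid, tests which cells fall inside a camera's field of view, and scores the configuration by the occupancy uncertainty those cells would resolve. The grid bounds, FOV model, and Bernoulli prior are all assumptions.

```python
# Hedged sketch of an information-style coverage score for a sensor setup.
# The grid bounds, FOV test, and Bernoulli-entropy scoring are illustrative
# assumptions, not the paper's surrogate metric.
import numpy as np

def in_camera_fov(points, cam_pos, yaw_deg, hfov_deg, max_range):
    """Boolean mask of grid points inside a simple camera FOV wedge."""
    rel = points - cam_pos
    dist = np.linalg.norm(rel[:, :2], axis=1)
    angle = np.degrees(np.arctan2(rel[:, 1], rel[:, 0])) - yaw_deg
    angle = (angle + 180) % 360 - 180          # wrap to [-180, 180)
    return (dist < max_range) & (np.abs(angle) < hfov_deg / 2)

def coverage_entropy_score(sensor_masks, prior=0.5):
    """Cells seen by at least one sensor resolve a Bernoulli(prior) occupancy
    uncertainty; sum the resolved entropy over the grid."""
    covered = np.any(np.stack(sensor_masks), axis=0)
    cell_entropy = -(prior * np.log2(prior) + (1 - prior) * np.log2(1 - prior))
    return covered.sum() * cell_entropy

# Usage: build a ground-plane grid around the ego vehicle and score one config.
xs, ys = np.meshgrid(np.arange(-40, 40, 1.0), np.arange(-40, 40, 1.0))
grid = np.stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)], axis=1)
front_cam = in_camera_fov(grid, cam_pos=np.array([0.0, 0.0, 1.6]),
                          yaw_deg=0.0, hfov_deg=90.0, max_range=50.0)
print(coverage_entropy_score([front_cam]))
```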

Pixel-wise Smoothing for Certified Robustness against Camera Motion Perturbations

Sep 22, 2023
Hanjiang Hu, Zuxin Liu, Linyi Li, Jiacheng Zhu, Ding Zhao

In recent years, computer vision has made remarkable advancements in autonomous driving and robotics. However, it has been observed that deep learning-based visual perception models lack robustness when faced with camera motion perturbations. The current certification process for assessing robustness is costly and time-consuming due to the extensive number of image projections required for Monte Carlo sampling in the 3D camera motion space. To address these challenges, we present a novel, efficient, and practical framework for certifying the robustness of 3D-2D projective transformations against camera motion perturbations. Our approach leverages a smoothing distribution over the 2D pixel space instead of in the 3D physical space, eliminating the need for costly camera motion sampling and significantly enhancing the efficiency of robustness certifications. With the pixel-wise smoothed classifier, we are able to fully upper bound the projection errors using a technique of uniform partitioning in camera motion space. Additionally, we extend our certification framework to a more general scenario where only a single-frame point cloud is required in the projection oracle. This is achieved by deriving Lipschitz-based approximated partition intervals. Through extensive experimentation, we validate the trade-off between effectiveness and efficiency enabled by our proposed method. Remarkably, our approach achieves approximately 80% certified accuracy while utilizing only 30% of the projected image frames.

* 32 pages, 5 figures, 13 tables 
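The core prediction step of a pixel-wise smoothed classifier can be sketched as Monte Carlo majority voting under per-pixel Gaussian noise, as below. The uniform-partition bounding of projection errors that yields the actual camera-motion certificate is not reproduced here, and `base_classifier` is an assumed callable.

```python
# Minimal sketch of a pixel-wise smoothed classifier: Monte Carlo prediction
# under per-pixel Gaussian noise. The uniform-partition projection-error
# bounding used for the camera-motion certificate is not reproduced.
# `base_classifier` is an assumed callable mapping an image tensor to logits.
import torch

def smoothed_predict(base_classifier, image, sigma=0.25, n_samples=100):
    """Majority-vote prediction of the pixel-wise smoothed classifier."""
    counts = {}
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = image + sigma * torch.randn_like(image)   # 2D pixel-space noise
            pred = base_classifier(noisy.unsqueeze(0)).argmax(dim=1).item()
            counts[pred] = counts.get(pred, 0) + 1
    return max(counts, key=counts.get)
```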

The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation

Jul 27, 2023
Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Ding Zhao, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, Minglei Li, Di Xu, Changpeng Yang, Yuanqi Yao, Gang Wu, Jian Kuai, Xianming Liu, Junjun Jiang, Jiamian Huang, Baojun Li, Jiale Chen, Shuang Zhang, Sun Ao, Zhenyu Li, Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu

Accurate depth estimation under out-of-distribution (OoD) scenarios, such as adverse weather conditions, sensor failure, and noise contamination, is desirable for safety-critical applications. Existing depth estimation systems, however, inevitably suffer from real-world corruptions and perturbations and struggle to provide reliable depth predictions in such cases. In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation. This challenge was developed based on the newly established KITTI-C and NYUDepth2-C benchmarks. We hosted two stand-alone tracks, with an emphasis on robust self-supervised and robust fully-supervised depth estimation, respectively. Out of more than two hundred participants, nine unique and top-performing solutions emerged, with novel designs spanning the following aspects: spatial- and frequency-domain augmentations, masked image modeling, image restoration and super-resolution, adversarial training, diffusion-based noise suppression, vision-language pre-training, learned model ensembling, and hierarchical feature enhancement. Extensive experimental analyses along with insightful observations are drawn to better understand the rationale behind each design. We hope this challenge can lay a solid foundation for future research on robust and reliable depth estimation and beyond. The datasets, competition toolkit, workshop recordings, and source code from the winning teams are publicly available on the challenge website.

* Technical Report; 65 pages, 34 figures, 24 tables; Code at https://github.com/ldkong1205/RoboDepth 
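As one concrete example of the frequency-domain augmentations mentioned above, the sketch below blends the low-frequency Fourier amplitude of a training image with that of a reference image while keeping the phase. The blend ratio and patch size are arbitrary choices for illustration rather than any team's exact recipe.

```python
# Illustrative frequency-domain augmentation: mix low-frequency Fourier
# amplitudes of two images while keeping the phase of the first.
import numpy as np

def fourier_amplitude_mix(img, ref, beta=0.1, alpha=0.5):
    """img, ref: float arrays in [0, 1] with shape (H, W, C)."""
    fft_img = np.fft.fft2(img, axes=(0, 1))
    fft_ref = np.fft.fft2(ref, axes=(0, 1))
    amp_img, pha_img = np.abs(fft_img), np.angle(fft_img)
    amp_ref = np.abs(fft_ref)

    # Mix only a centered low-frequency square of the (shifted) amplitude.
    amp_img = np.fft.fftshift(amp_img, axes=(0, 1))
    amp_ref = np.fft.fftshift(amp_ref, axes=(0, 1))
    h, w = img.shape[:2]
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    sl = (slice(ch - bh, ch + bh), slice(cw - bw, cw + bw))
    amp_img[sl] = (1 - alpha) * amp_img[sl] + alpha * amp_ref[sl]
    amp_img = np.fft.ifftshift(amp_img, axes=(0, 1))

    mixed = np.fft.ifft2(amp_img * np.exp(1j * pha_img), axes=(0, 1))
    return np.clip(np.real(mixed), 0.0, 1.0)
```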

Datasets and Benchmarks for Offline Safe Reinforcement Learning

Jun 16, 2023
Zuxin Liu, Zijian Guo, Haohong Lin, Yihang Yao, Jiacheng Zhu, Zhepeng Cen, Hanjiang Hu, Wenhao Yu, Tingnan Zhang, Jie Tan, Ding Zhao

This paper presents a comprehensive benchmarking suite tailored to offline safe reinforcement learning (RL) challenges, aiming to foster progress in the development and evaluation of safe learning algorithms in both the training and deployment phases. Our benchmark suite contains three packages: 1) expertly crafted safe policies, 2) D4RL-styled datasets along with environment wrappers, and 3) high-quality offline safe RL baseline implementations. We feature a methodical data collection pipeline powered by advanced safe RL algorithms, which facilitates the generation of diverse datasets across 38 popular safe RL tasks, from robot control to autonomous driving. We further introduce an array of data post-processing filters, capable of modifying each dataset's diversity, thereby simulating various data collection conditions. Additionally, we provide elegant and extensible implementations of prevalent offline safe RL algorithms to accelerate research in this area. Through extensive experiments with over 50,000 CPU hours and 800 GPU hours of computation, we evaluate and compare the performance of these baseline algorithms on the collected datasets, offering insights into their strengths, limitations, and potential areas of improvement. Our benchmarking framework serves as a valuable resource for researchers and practitioners, facilitating the development of more robust and reliable offline safe RL solutions in safety-critical applications. The benchmark website is available at www.offline-saferl.org.

* 22 pages, 13 figures, 7 tables 
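A hedged sketch of working with a D4RL-styled offline safe RL dataset follows: split flat transition arrays into trajectories and filter them by cumulative constraint cost. The field names and the cost-threshold filter are illustrative assumptions, not the benchmark's actual post-processing filters.

```python
# Hedged sketch of a D4RL-styled offline safe RL dataset workflow.
# Field names ('observations', ..., 'costs', 'terminals', 'timeouts') and the
# cost-threshold filter are assumptions for illustration only.
import numpy as np

def split_trajectories(dataset):
    """Split flat transition arrays into per-trajectory dictionaries."""
    ends = np.where(dataset["terminals"] | dataset["timeouts"])[0]
    starts = np.concatenate([[0], ends[:-1] + 1])
    return [
        {k: v[s : e + 1] for k, v in dataset.items()}
        for s, e in zip(starts, ends)
    ]

def filter_by_cost(trajectories, cost_limit=25.0):
    """Keep only trajectories whose cumulative constraint cost is within budget."""
    return [t for t in trajectories if t["costs"].sum() <= cost_limit]

# Usage with a dataset dict of flat numpy arrays:
# trajs = split_trajectories(dataset)
# safe_trajs = filter_by_cost(trajs, cost_limit=25.0)
```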

Robustness Certification of Visual Perception Models via Camera Motion Smoothing

Oct 04, 2022
Hanjiang Hu, Zuxin Liu, Linyi Li, Jiacheng Zhu, Ding Zhao

A vast literature shows that learning-based visual perception models are sensitive to adversarial noise, but few works consider the robustness of robotic perception models under the widely existing camera motion perturbations. To this end, we study the robustness of visual perception models under camera motion perturbations to investigate the influence of camera motion on robotic perception. Specifically, we propose a motion smoothing technique for arbitrary image classification models, whose robustness under camera motion perturbations can be certified. The proposed robustness certification framework based on camera motion smoothing provides tight and scalable robustness guarantees for visual perception modules, so that they are applicable to a wide range of robotic applications. As far as we are aware, this is the first work to provide robustness certification for deep perception modules against camera motion, improving the trustworthiness of robotic perception. A realistic indoor robotic dataset with a dense point cloud map of the entire room, MetaRoom, is introduced for this challenging certifiably robust perception task. We conduct extensive experiments to validate the certification approach via motion smoothing against camera motion perturbations. Our framework guarantees a certified accuracy of 81.7% against camera translation perturbations along the depth direction within -0.1 m to 0.1 m. We also validate the effectiveness of our method on a real-world robot by conducting hardware experiments on a robotic arm with an eye-in-hand camera. The code is available at https://github.com/HanjiangHu/camera-motion-smoothing.

* CoRL 2022, 20 pages, 7 figures, 8 tables 
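The prediction step of a camera-motion-smoothed classifier can be sketched as a Monte Carlo majority vote over images re-rendered under sampled camera translations, as below. `render_from_motion` is a hypothetical stand-in for the projection oracle built from the dense point cloud map, and the certification step itself is omitted.

```python
# Minimal sketch of camera motion smoothing: Monte Carlo majority vote over
# images re-rendered under sampled camera translations along the depth axis.
# `render_from_motion(point_cloud, t_z)` is a hypothetical placeholder for the
# projection oracle built from a dense MetaRoom-style point cloud.
import torch

def motion_smoothed_predict(classifier, point_cloud, radius=0.1, n_samples=200):
    """Majority vote of `classifier` over uniform depth-axis translations."""
    votes = {}
    with torch.no_grad():
        for _ in range(n_samples):
            t_z = (2 * torch.rand(1).item() - 1) * radius    # uniform in [-r, r]
            image = render_from_motion(point_cloud, t_z)     # hypothetical oracle
            label = classifier(image.unsqueeze(0)).argmax(dim=1).item()
            votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```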

SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles

Jun 20, 2022
Chejian Xu, Wenhao Ding, Weijie Lyu, Zuxin Liu, Shuai Wang, Yihan He, Hanjiang Hu, Ding Zhao, Bo Li

As shown by recent studies, machine intelligence-enabled systems are vulnerable to test cases resulting from either adversarial manipulation or natural distribution shifts. This has raised great concerns about deploying machine learning algorithms for real-world applications, especially in safety-critical domains such as autonomous driving (AD). On the other hand, traditional AD testing on naturalistic scenarios requires hundreds of millions of driving miles due to the high dimensionality and rareness of safety-critical scenarios in the real world. As a result, several approaches for autonomous driving evaluation have been explored; however, they are usually based on different simulation platforms, types of safety-critical scenarios, scenario generation algorithms, and driving route variations. Thus, despite a large amount of effort in autonomous driving testing, it is still challenging to compare and understand the effectiveness and efficiency of different testing scenario generation algorithms and testing mechanisms under similar conditions. In this paper, we provide SafeBench, the first unified platform that integrates different types of safety-critical testing scenarios, scenario generation algorithms, and other variations such as driving routes and environments. Meanwhile, we implement 4 deep reinforcement learning-based AD algorithms with 4 types of input (e.g., bird's-eye view, camera) to perform fair comparisons on SafeBench. We find that our generated testing scenarios are indeed more challenging and observe a trade-off between the performance of AD agents under benign and safety-critical testing scenarios. We believe our unified platform SafeBench for large-scale and effective autonomous driving testing will motivate the development of new testing scenario generation and safe AD algorithms. SafeBench is available at https://safebench.github.io.
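The benign vs. safety-critical trade-off mentioned above can be summarized with a simple evaluation aggregation like the sketch below; the per-episode record fields are assumptions for illustration and not the SafeBench toolkit API.

```python
# Illustrative evaluation summary in the spirit of a benign vs. safety-critical
# comparison. The per-episode record fields are assumptions, not SafeBench's API.
from statistics import mean

def summarize(episodes):
    """episodes: list of dicts with 'scenario', 'collision', 'route_completion'."""
    summary = {}
    for kind in ("benign", "safety_critical"):
        subset = [e for e in episodes if e["scenario"] == kind]
        if subset:
            summary[kind] = {
                "collision_rate": mean(e["collision"] for e in subset),
                "route_completion": mean(e["route_completion"] for e in subset),
            }
    return summary
```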

vLPD-Net: A Registration-aided Domain Adaptation Network for 3D Point Cloud Based Place Recognition

Dec 09, 2020
Zhijian Qiao, Hanjiang Hu, Siyuan Chen, Zhe Liu, Zhuowen Shen, Hesheng Wang

In large-scale SLAM for autonomous driving and mobile robotics, 3D point cloud based place recognition has attracted significant research interest due to its robustness to changing environments with drastic daytime and weather variance. However, obtaining high-quality point cloud data and ground truth for registration and place recognition model training in the real world is time-consuming and labor-intensive. To this end, we propose a novel registration-aided 3D domain adaptation network for point cloud based place recognition. A structure-aware registration network is introduced to learn features from geometric properties, and a matching-rate-based triplet loss is employed for metric learning. The model is trained on a new virtual LiDAR dataset generated with GTA-V under diverse weather and daytime conditions, and domain adaptation to the real-world domain is performed by aligning local and global features. Extensive experiments validate the effectiveness of the structure-aware registration network and the domain adaptation. Our method outperforms state-of-the-art 3D place recognition baselines on the real-world Oxford RobotCar dataset, with visualizations of large-scale registration on the virtual dataset.
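A plain triplet margin loss on global descriptors, shown below, is a hedged stand-in for the matching-rate-based triplet loss used for metric learning here; the matching-rate weighting itself is not reproduced.

```python
# Plain triplet margin loss on global point-cloud descriptors, a hedged
# stand-in for the paper's matching-rate-based triplet loss.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.5):
    """anchor/positive/negative: (B, D) global descriptors."""
    d_pos = torch.norm(anchor - positive, dim=1)   # distance to same-place sample
    d_neg = torch.norm(anchor - negative, dim=1)   # distance to a different place
    return F.relu(d_pos - d_neg + margin).mean()
```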

SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments

Nov 09, 2020
Hanjiang Hu, Baoquan Yang, Weiang Shi, Zhijian Qiao, Hesheng Wang

Monocular depth prediction has been well studied recently, yet few works focus on depth prediction across multiple environments, e.g., changing illumination and seasons, owing to the lack of a corresponding real-world dataset and benchmark. In this work, we derive a new cross-season, scaleless monocular depth prediction dataset, SeasonDepth, from the CMU Visual Localization dataset via structure from motion. We then formulate several metrics to benchmark performance under different environments using recent state-of-the-art open-source pretrained depth prediction models from the KITTI benchmark. Through extensive zero-shot evaluation on the proposed dataset, we show that long-term monocular depth prediction is far from solved and point to promising directions for future work, including geometry-based or scale-invariant training. Moreover, multi-environment synthetic datasets and cross-dataset validation are beneficial to robustness against real-world environmental variance.

* Submitted to ICRA 2021 
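Since the dataset is scaleless, zero-shot evaluation typically aligns predictions to ground truth before computing errors; the sketch below shows median scaling followed by absolute relative error as one such metric, which may differ from the exact benchmark formulation.

```python
# Sketch of a scale-aligned depth error in the style of zero-shot monocular
# depth evaluation: median scaling followed by absolute relative error.
# The exact SeasonDepth benchmark metrics may differ; this is illustrative.
import numpy as np

def median_scaled_abs_rel(pred, gt, eps=1e-6):
    """pred, gt: depth maps; gt <= eps marks invalid pixels (scaleless dataset)."""
    valid = gt > eps
    pred, gt = pred[valid], gt[valid]
    pred = pred * (np.median(gt) / np.median(pred))   # align the unknown scale
    return np.mean(np.abs(pred - gt) / gt)
```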

DASGIL: Domain Adaptation for Semantic and Geometric-aware Image-based Localization

Oct 01, 2020
Hanjiang Hu, Ming Cheng, Zhe Liu, Hesheng Wang

Long-term visual localization under changing environments is a challenging problem in autonomous driving and mobile robotics due to seasonal and illumination variance, among other factors. Image retrieval for localization is an efficient and effective solution to this problem. In this paper, we propose a novel multi-task architecture that fuses geometric and semantic information into a multi-scale latent embedding representation for visual place recognition. To leverage high-quality ground truth without any human effort, a depth and segmentation generator model is trained on a virtual synthetic dataset, and domain adaptation is applied from the synthetic to the real-world domain. The multi-scale model shows strong generalization to the real-world KITTI dataset despite being trained on the virtual KITTI 2 dataset. The proposed approach is validated on the Extended CMU-Seasons dataset through a series of comparison experiments, where it outperforms state-of-the-art baselines for retrieval-based localization under challenging environments.

* Submitted to TIP 
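One common way to implement the synthetic-to-real feature alignment described above is a gradient reversal layer in front of a domain discriminator, sketched below; whether DASGIL uses this particular mechanism or a separately trained discriminator is not specified here.

```python
# Standard gradient reversal layer (GRL), a common building block for
# adversarial synthetic-to-real feature alignment. Illustrative only.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients so the feature extractor is trained
        # to fool the domain discriminator placed after this layer.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)
```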