Taekyung Kim

METAVerse: Meta-Learning Traversability Cost Map for Off-Road Navigation

Jul 26, 2023

Junwon Seo, Taekyung Kim, Seongyong Ahn, Kiho Kwak

Abstract: Autonomous navigation in off-road conditions requires an accurate estimation of terrain traversability. However, traversability estimation in unstructured environments is subject to high uncertainty due to the variability of numerous factors that influence vehicle-terrain interaction. Consequently, it is challenging to obtain a generalizable model that can accurately predict traversability in a variety of environments. This paper presents METAVerse, a meta-learning framework for learning a global model that accurately and reliably predicts terrain traversability across diverse environments. We train the traversability prediction network to generate a dense and continuous-valued cost map from a sparse LiDAR point cloud, leveraging vehicle-terrain interaction feedback in a self-supervised manner. Meta-learning is utilized to train a global model with driving data collected from multiple environments, effectively minimizing estimation uncertainty. During deployment, online adaptation is performed to rapidly adapt the network to the local environment by exploiting recent interaction experiences. To conduct a comprehensive evaluation, we collect driving data from various terrains and demonstrate that our method can obtain a global model that minimizes uncertainty. Moreover, by integrating our model with a model predictive controller, we demonstrate that the reduced uncertainty results in safe and stable navigation in unstructured and unknown terrains.

* Our video can be found at https://youtu.be/4rIAMM1ZKMo

Safe Navigation in Unstructured Environments by Minimizing Uncertainty in Control and Perception

Jun 26, 2023

Junwon Seo, Jungwi Mun, Taekyung Kim

Abstract: Uncertainty in control and perception poses challenges for autonomous vehicle navigation in unstructured environments, leading to navigation failures and potential vehicle damage. This paper introduces a framework that minimizes control and perception uncertainty to ensure safe and reliable navigation. The framework consists of two uncertainty-aware models: a learning-based vehicle dynamics model and a self-supervised traversability estimation model. We train a vehicle dynamics model that can quantify the epistemic uncertainty of the model to perform active exploration, resulting in the efficient collection of training data and effective avoidance of uncertain state-action spaces. In addition, we employ meta-learning to train a traversability cost prediction network. The model can be trained with driving data from various terrain types, and it can adapt online based on interaction experiences to reduce the aleatoric uncertainty. Integrating the dynamics model and traversability cost prediction model with a sampling-based model predictive controller allows for optimizing trajectories that avoid uncertain terrains and state-action spaces. Experimental results demonstrate that the proposed method reduces uncertainty in prediction and improves stability in autonomous vehicle navigation in unstructured environments.

* RSS 2023 Workshop on Inference and Decision Making for Autonomous Vehicles (IDMAV)

Augmenting Sub-model to Improve Main Model

Jun 20, 2023

Byeongho Heo, Taekyung Kim, Sangdoo Yun, Dongyoon Han

Abstract: Image classification has improved with the development of training techniques. However, these techniques often require careful parameter tuning to balance the strength of regularization, limiting their potential benefits. In this paper, we propose a novel way to use regularization called Augmenting Sub-model (AugSub). AugSub consists of two models: the main model and the sub-model. While the main model employs conventional training recipes, the sub-model leverages the benefit of additional regularization. AugSub achieves this by mitigating adverse effects through a relaxed loss function similar to self-distillation loss. We demonstrate the effectiveness of AugSub with three drop techniques: dropout, drop-path, and random masking. Our analysis shows that all AugSub variants improve performance, with the training loss converging even faster than regular training. Among the three, AugMask is identified as the most practical method due to its performance and cost efficiency. We further validate AugMask across diverse training recipes, including DeiT-III, ResNet, MAE fine-tuning, and Swin Transformer. The results show that AugMask consistently provides significant performance gain. AugSub provides a practical and effective solution for introducing additional regularization under various training recipes. Code is available at https://github.com/naver-ai/augsub.

* 15 pages, 3 figures

Bridging Active Exploration and Uncertainty-Aware Deployment Using Probabilistic Ensemble Neural Network Dynamics

May 20, 2023

Taekyung Kim, Jungwi Mun, Junwon Seo, Beomsu Kim, Seongil Hong

Abstract: In recent years, learning-based control in robotics has gained significant attention due to its capability to address complex tasks in real-world environments. With the advances in machine learning algorithms and computational capabilities, this approach is becoming increasingly important for solving challenging control problems in robotics by learning unknown or partially known robot dynamics. Active exploration, in which a robot directs itself to states that yield the highest information gain, is essential for efficient data collection and minimizing human supervision. Similarly, uncertainty-aware deployment has been a growing concern in robotic control, as uncertain actions informed by the learned model can lead to unstable motions or failure. However, active exploration and uncertainty-aware deployment have been studied independently, and there is limited literature that seamlessly integrates them. This paper presents a unified model-based reinforcement learning framework that bridges these two tasks in the robotics control domain. Our framework uses a probabilistic ensemble neural network for dynamics learning, allowing the quantification of epistemic uncertainty via Jensen-Rényi divergence. The two opposing tasks of exploration and deployment are optimized through state-of-the-art sampling-based MPC, resulting in efficient collection of training data and successful avoidance of uncertain state-action spaces. We conduct experiments on both autonomous vehicles and wheeled robots, showing promising results for both exploration and deployment.

* 2023 Robotics: Science and Systems (RSS). Project page: https://taekyung.me/rss2023-bridging
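The abstract above describes exploration and deployment as opposing uses of the same epistemic-uncertainty estimate. The following is a minimal sketch of that idea, not the paper's implementation: it swaps in the per-dimension variance of the ensemble members' mean predictions as a simpler stand-in for the Jensen-Rényi divergence, and the function names, the `task_cost` input, and the `uncertainty_weight` parameter are all hypothetical.

```python
import statistics


def ensemble_disagreement(member_means):
    """Per-dimension population variance of the members' mean predictions.

    Assumption: this variance is a simple proxy for epistemic uncertainty;
    it is NOT the Jensen-Renyi divergence named in the abstract.
    """
    if len(member_means) < 2:
        raise ValueError("need at least two ensemble members")
    dims = len(member_means[0])
    return [statistics.pvariance([m[d] for m in member_means]) for d in range(dims)]


def exploration_cost(member_means):
    # Exploration seeks uncertain state-action pairs, so disagreement lowers the cost.
    return -sum(ensemble_disagreement(member_means))


def deployment_cost(task_cost, member_means, uncertainty_weight=1.0):
    # Deployment avoids uncertain state-action pairs, so disagreement raises the cost.
    return task_cost + uncertainty_weight * sum(ensemble_disagreement(member_means))


# Two hypothetical ensemble members predicting a 2-D next state.
predictions = [[1.0, 2.0], [1.0, 4.0]]
print(ensemble_disagreement(predictions))
print(exploration_cost(predictions))
print(deployment_cost(2.0, predictions, uncertainty_weight=0.5))
```

Under these assumptions, the only difference between the two modes is the sign of the uncertainty term, which is why a single sampling-based controller could in principle optimize either objective.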

Learning Terrain-Aware Kinodynamic Model for Autonomous Off-Road Rally Driving With Model Predictive Path Integral Control

May 01, 2023

Hojin Lee, Taekyung Kim, Jungwi Mun, Wonsuk Lee

Abstract: High-speed autonomous driving in off-road environments has immense potential for various applications, but it also presents challenges due to the complexity of vehicle-terrain interactions. In such environments, it is crucial for the vehicle to predict its motion and adjust its controls proactively in response to environmental changes, such as variations in terrain elevation. To this end, we propose a method for learning a terrain-aware kinodynamic model that is conditioned on both proprioceptive and exteroceptive information. The proposed model generates reliable predictions of 6-degree-of-freedom motion and can even estimate contact interactions without requiring ground truth force data during training. This enables the design of a safe and robust model predictive controller through appropriate cost function design, which penalizes sampled trajectories with unstable motion, unsafe interactions, and high levels of uncertainty derived from the model. We demonstrate the effectiveness of our approach through experiments on a simulated off-road track, showing that our proposed model-controller pair outperforms the baseline and ensures robust high-speed driving performance without control failure.

* Our video can be found at https://youtu.be/VXf_prNQnJo
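The abstract above describes a cost function that penalizes sampled trajectories for unstable motion, unsafe interactions, and model uncertainty, used inside Model Predictive Path Integral (MPPI) control. The sketch below illustrates the generic MPPI weighting step under stated assumptions: the three penalty inputs, their weights, and the `temperature` parameter are hypothetical placeholders, and none of this is taken from the paper's code.

```python
import math


def trajectory_cost(instability, contact_risk, uncertainty,
                    w_instability=1.0, w_contact=1.0, w_uncertainty=1.0):
    # Hypothetical weighted sum of the three penalty terms named in the abstract.
    return (w_instability * instability
            + w_contact * contact_risk
            + w_uncertainty * uncertainty)


def mppi_weights(costs, temperature=1.0):
    """Softmin weights over sampled trajectory costs (sum to 1)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    best = min(costs)  # subtract the minimum cost for numerical stability
    unnormalized = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mppi_update(nominal, perturbations, costs, temperature=1.0):
    """Shift the nominal control sequence by the cost-weighted mean perturbation."""
    weights = mppi_weights(costs, temperature)
    return [
        nominal[t] + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t in range(len(nominal))
    ]


# Two sampled perturbation sequences over a two-step horizon.
costs = [trajectory_cost(0.1, 0.0, 0.2), trajectory_cost(2.0, 1.0, 3.0)]
print(mppi_update([0.0, 0.0], [[1.0, 1.0], [-1.0, -1.0]], costs))
```

A lower `temperature` concentrates the update on the cheapest samples, while a higher value averages more evenly across them.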

What Do Self-Supervised Vision Transformers Learn?

May 01, 2023

Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, Sangdoo Yun

Abstract: We present a comparative study on how and why contrastive learning (CL) and masked image modeling (MIM) differ in their representations and in their performance on downstream tasks. In particular, we demonstrate that self-supervised Vision Transformers (ViTs) have the following properties: (1) CL trains self-attentions to capture longer-range global patterns than MIM, such as the shape of an object, especially in the later layers of the ViT architecture. This CL property helps ViTs linearly separate images in their representation spaces. However, it also makes the self-attentions collapse into homogeneity for all query tokens and heads. Such homogeneity of self-attention reduces the diversity of representations, worsening scalability and dense prediction performance. (2) CL utilizes the low-frequency signals of the representations, but MIM utilizes the high-frequency signals. Since low- and high-frequency information respectively represent shapes and textures, CL is more shape-oriented and MIM more texture-oriented. (3) CL plays a crucial role in the later layers, while MIM mainly focuses on the early layers. Based on these analyses, we find that CL and MIM can complement each other and observe that even the simplest harmonization can help leverage the advantages of both methods. The code is available at https://github.com/naver-ai/cl-vs-mim.

* ICLR 2023

Panoramic Image-to-Image Translation

Apr 11, 2023

Soohyun Kim, Junho Kim, Taekyung Kim, Hwan Heo, Seungryong Kim, Jiyoung Lee, Jin-Hwa Kim

Abstract: In this paper, we tackle the challenging task of Panoramic Image-to-Image translation (Pano-I2I) for the first time. This task is difficult due to the geometric distortion of panoramic images and the lack of a panoramic image dataset with diverse conditions, like weather or time. To address these challenges, we propose a panoramic distortion-aware I2I model that preserves the structure of the panoramic images while consistently translating their global style referenced from a pinhole image. To mitigate the distortion issue in naive 360 panorama translation, we adopt spherical positional embedding to our transformer encoders, introduce a distortion-free discriminator, and apply sphere-based rotation for augmentation and its ensemble. We also design a content encoder and a style encoder to be deformation-aware to deal with a large domain gap between panoramas and pinhole images, enabling us to work on diverse conditions of pinhole images. In addition, considering the large discrepancy between panoramas and pinhole images, our framework decouples the learning procedure of the panoramic reconstruction stage from the translation stage. We show distinct improvements over existing I2I models in translating the StreetLearn dataset in the daytime into diverse conditions. The code will be publicly available online for our community.

Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

Feb 03, 2023

Hwan Heo, Taekyung Kim, Jiyoung Lee, Jaewon Lee, Soohyun Kim, Hyunwoo J. Kim, Jin-Hwa Kim

Abstract: Multi-resolution hash encoding has recently been proposed to reduce the computational cost of neural renderings, such as NeRF. This method requires accurate camera poses for the neural renderings of given scenes. However, contrary to previous methods jointly optimizing camera poses and 3D scenes, the naive gradient-based camera pose refinement method using multi-resolution hash encoding severely deteriorates performance. We propose a joint optimization algorithm to calibrate the camera pose and learn a geometric representation using efficient multi-resolution hash encoding. Showing that the oscillating gradient flows of hash encoding interfere with the registration of camera poses, our method addresses the issue by utilizing smooth interpolation weighting to stabilize the gradient oscillation for the ray samplings across hash grids. Moreover, the curriculum training procedure helps to learn the level-wise hash encoding, further improving pose refinement. Experiments on the novel-view synthesis datasets validate that our learning frameworks achieve state-of-the-art performance and rapid convergence of neural rendering, even when initial camera poses are unknown.

Uncertainty Reduction for 3D Point Cloud Self-Supervised Traversability Estimation

Nov 21, 2022

Jihwan Bae, Junwon Seo, Taekyung Kim, Hae-gon Jeon, Kiho Kwak, Inwook Shim

Abstract: Traversability estimation in off-road environments requires a robust perception system. Recently, approaches that learn traversability estimation from past vehicle experiences in a self-supervised manner have emerged, as they can greatly reduce human labeling costs and labeling errors. Nonetheless, the self-supervised learning setting for traversability estimation suffers from inherent uncertainties that arise from the scarcity of negative information. Negative data are rarely harvested, as the system can be severely damaged while logging the data. To mitigate the uncertainty, we introduce a method that incorporates unlabeled data in order to leverage the uncertainty. First, we design a learning architecture that takes query and support data as input. Second, unlabeled data are assigned based on their proximity in the metric space. Third, a new metric for uncertainty measures is introduced. We evaluated our approach on our own dataset, 'Dtrail', which is composed of a wide variety of negative data.

ScaTE: A Scalable Framework for Self-Supervised Traversability Estimation in Unstructured Environments

Sep 14, 2022

Junwon Seo, Taekyung Kim, Kiho Kwak, Jihong Min, Inwook Shim

Abstract: For the safe and successful navigation of autonomous vehicles in unstructured environments, the traversability of terrain should vary based on the driving capabilities of the vehicles. Actual driving experience can be utilized in a self-supervised fashion to learn vehicle-specific traversability. However, existing methods for learning self-supervised traversability are not highly scalable for learning the traversability of various vehicles. In this work, we introduce a scalable framework for learning self-supervised traversability, which can learn the traversability directly from vehicle-terrain interaction without any human supervision. We train a neural network that predicts the proprioceptive experience that a vehicle would undergo from 3D point clouds. Using a novel positive-unlabeled (PU) learning method, the network simultaneously identifies non-traversable regions where estimations can be overconfident. With driving data of various vehicles gathered from simulation and the real world, we show that our framework is capable of learning the self-supervised traversability of various vehicles. By integrating our framework with a model predictive controller, we demonstrate that estimated traversability results in effective navigation that enables distinct maneuvers based on the driving characteristics of the vehicles. In addition, experimental results validate the ability of our method to identify and avoid non-traversable regions.

* Our video can be found at https://youtu.be/kSvqjHDqmIk
