Sang-Hyun Lee

Chain-of-Goals Hierarchical Policy for Long-Horizon Offline Goal-Conditioned RL

Feb 03, 2026

Jinwoo Choi, Sang-Hyun Lee, Seung-Woo Seo

Abstract: Offline goal-conditioned reinforcement learning remains challenging for long-horizon tasks. While hierarchical approaches mitigate this issue by decomposing tasks, most existing methods rely on separate high- and low-level networks and generate only a single intermediate subgoal, making them inadequate for complex tasks that require coordinating multiple intermediate decisions. To address this limitation, we draw inspiration from the chain-of-thought paradigm and propose the Chain-of-Goals Hierarchical Policy (CoGHP), a novel framework that reformulates hierarchical decision-making as autoregressive sequence modeling within a unified architecture. Given a state and a final goal, CoGHP autoregressively generates a sequence of latent subgoals followed by the primitive action, where each latent subgoal acts as a reasoning step that conditions subsequent predictions. To implement this efficiently, we pioneer the use of an MLP-Mixer backbone, which supports cross-token communication and captures structural relationships among state, goal, latent subgoals, and action. Across challenging navigation and manipulation benchmarks, CoGHP consistently outperforms strong offline baselines, demonstrating improved performance on long-horizon tasks.

* 22 pages
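
The generation order described in the abstract above can be illustrated with a minimal Python sketch. It shows only the autoregressive control flow: state and goal go in, a chain of latent subgoals comes out, and the action is predicted last. The names `chain_of_goals` and `toy_predictor`, the list-of-floats representation, and the midpoint rule are all hypothetical choices made for illustration. The toy predictor merely stands in for the learned MLP-Mixer model, whose details are not given in this listing.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def chain_of_goals(
    state: Vector,
    goal: Vector,
    predict_next: Callable[[Sequence[Vector]], Vector],
    num_subgoals: int,
) -> Tuple[List[Vector], Vector]:
    """Generate latent subgoals one at a time, then the primitive action."""
    tokens: List[Vector] = [state, goal]
    subgoals: List[Vector] = []
    for _ in range(num_subgoals):
        # Each prediction sees the state, the goal, and all earlier subgoals.
        z = predict_next(tokens)
        subgoals.append(z)
        tokens.append(z)
    # The action is predicted last, conditioned on the full chain.
    action = predict_next(tokens)
    return subgoals, action


def toy_predictor(tokens: Sequence[Vector]) -> Vector:
    """Hypothetical stand-in for the learned model (NOT the paper's network).

    Returns the midpoint between the most recent token and the goal, so each
    output visibly depends on the previous one.
    """
    goal = tokens[1]
    previous = tokens[0] if len(tokens) == 2 else tokens[-1]
    return [(p + g) / 2.0 for p, g in zip(previous, goal)]


subgoals, action = chain_of_goals([0.0, 0.0], [8.0, 4.0], toy_predictor, num_subgoals=3)
print(subgoals)  # [[4.0, 2.0], [6.0, 3.0], [7.0, 3.5]]
print(action)    # [7.5, 3.75]
```

In this toy setup the action lives in the same space as the subgoals purely for simplicity; a real policy would emit an action in the environment's own action space.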

Autonomous Algorithm for Training Autonomous Vehicles with Minimal Human Intervention

May 22, 2024

Sang-Hyun Lee, Daehyeok Kwon, Seung-Woo Seo

Abstract: Reinforcement learning (RL) provides a compelling framework for enabling autonomous vehicles to continue to learn and improve diverse driving behaviors on their own. However, training real-world autonomous vehicles with current RL algorithms presents several challenges. One critical challenge, often overlooked in these algorithms, is the need to reset a driving environment between every episode. While resetting an environment after each episode is trivial in simulated settings, it demands significant human intervention in the real world. In this paper, we introduce a novel autonomous algorithm that allows off-the-shelf RL algorithms to train an autonomous vehicle with minimal human intervention. Our algorithm takes into account the learning progress of the autonomous vehicle to determine when to abort episodes before it enters unsafe states and where to reset it for subsequent episodes in order to gather informative transitions. The learning progress is estimated based on the novelty of both current and future states. We also take advantage of rule-based autonomous driving algorithms to safely reset an autonomous vehicle to an initial state. We evaluate our algorithm against baselines on diverse urban driving tasks. The experimental results show that our algorithm is task-agnostic and achieves better driving performance with fewer manual resets than baselines.

* 8 pages, 6 figures, 2 tables, conference
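
As a rough illustration of the idea in the abstract above (using the novelty of current and future states to decide when to abort an episode), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how novelty is measured or how it is thresholded, so this sketch swaps in a simple visit-count novelty score, takes the predicted future states as given, and uses the hypothetical names `NoveltyTracker` and `should_abort`.

```python
from collections import Counter
from math import sqrt
from typing import Hashable, Iterable


class NoveltyTracker:
    """Count-based novelty (an assumed stand-in, not the paper's estimator)."""

    def __init__(self) -> None:
        self.visits: Counter = Counter()

    def update(self, state: Hashable) -> None:
        self.visits[state] += 1

    def novelty(self, state: Hashable) -> float:
        # 1.0 for a never-visited state, shrinking as visits accumulate.
        return 1.0 / sqrt(1 + self.visits[state])


def should_abort(
    tracker: NoveltyTracker,
    current: Hashable,
    predicted_future: Iterable[Hashable],
    threshold: float,
) -> bool:
    """Abort when the current or any predicted future state is too novel."""
    states = [current, *predicted_future]
    return max(tracker.novelty(s) for s in states) > threshold


tracker = NoveltyTracker()
for _ in range(3):
    tracker.update("lane_keep")  # a well-explored (discretized) state

stay_familiar = should_abort(tracker, "lane_keep", ["lane_keep"], threshold=0.8)
enter_unknown = should_abort(tracker, "lane_keep", ["unseen_junction"], threshold=0.8)
print(stay_familiar, enter_unknown)  # False True
```

States are plain hashable labels here; a real driving state would need discretization or a learned novelty model before anything like this could apply.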

Results and Lessons Learned from Autonomous Driving Transportation Services in Airfield, Crowded Indoor, and Urban Environments

Mar 02, 2024

Doosan Baek, Sanghyun Kim, Seung-Woo Seo, Sang-Hyun Lee

Abstract: Autonomous vehicles have been actively investigated over the past few decades. Several recent works show the potential of autonomous driving transportation services in urban environments with impressive experimental results. However, these works note that autonomous vehicles are still occasionally inferior to expert drivers in complex scenarios. Furthermore, they do not focus on the possibilities of autonomous driving transportation services in other areas beyond urban environments. This paper presents the research results and lessons learned from autonomous driving transportation services in airfield, crowded indoor, and urban environments. We discuss how we address several unique challenges in these diverse environments. We also offer an overview of remaining challenges that have not received much attention but must be addressed. This paper aims to share our unique experience to support researchers who are interested in realizing the potential of autonomous vehicles in various real-world environments.

* 8 pages, 7 figures, 4 tables

Imagination-augmented Hierarchical Reinforcement Learning for Safe and Interactive Autonomous Driving in Urban Environments

Nov 17, 2023

Sang-Hyun Lee, Yoonjae Jung, Seung-Woo Seo

Abstract: Hierarchical reinforcement learning (HRL) has led to remarkable achievements in diverse fields. However, existing HRL algorithms still cannot be applied to real-world navigation tasks. These tasks require an agent to perform safety-aware behaviors and interact with surrounding objects in dynamic environments. In addition, an agent in these tasks should perform consistent and structured exploration because the tasks are long-horizon and have complex structures with diverse objects and task-specific rules. Designing HRL agents that can handle these challenges in real-world navigation tasks is an open problem. In this paper, we propose imagination-augmented HRL (IAHRL), a new and general navigation algorithm that allows an agent to learn safe and interactive behaviors in real-world navigation tasks. Our key idea is to train a hierarchical agent in which a high-level policy infers interactions by interpreting behaviors imagined with low-level policies. Specifically, the high-level policy is designed with a permutation-invariant attention mechanism to determine which low-level policy generates the most interactive behavior, and the low-level policies are implemented with an optimization-based behavior planner to generate safe and structured behaviors following task-specific rules. To evaluate our algorithm, we introduce five complex urban driving tasks, which are among the most challenging real-world navigation tasks. The experimental results indicate that our hierarchical agent performs safety-aware behaviors and properly interacts with surrounding vehicles, achieving higher success rates and lower average episode steps than baselines in urban driving tasks.

* 11 pages, 8 figures

Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning without Task-Specific Knowledge

Nov 15, 2023

Sang-Hyun Lee, Seung-Woo Seo

Abstract: A significant bottleneck in applying current reinforcement learning algorithms to real-world scenarios is the need to reset the environment between every episode. This reset process demands substantial human intervention, making it difficult for the agent to learn continuously and autonomously. Several recent works have introduced autonomous reinforcement learning (ARL) algorithms that generate curricula for jointly training reset and forward policies. While their curricula can reduce the number of required manual resets by taking into account the agent's learning progress, they rely on task-specific knowledge, such as predefined initial states or reset reward functions. In this paper, we propose a novel ARL algorithm that can generate a curriculum adaptive to the agent's learning progress without task-specific knowledge. Our curriculum empowers the agent to autonomously reset to diverse and informative initial states. To achieve this, we introduce a success discriminator that estimates the success probability from each initial state when the agent follows the forward policy. The success discriminator is trained with relabeled transitions in a self-supervised manner. Our experimental results demonstrate that our ARL algorithm can generate an adaptive curriculum and enable the agent to efficiently bootstrap to solve sparse-reward maze navigation tasks, outperforming baselines with significantly fewer manual resets.

* 8 pages, 5 figures
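
One way to picture how a success discriminator could drive a curriculum, as described in the abstract above, is the following minimal Python sketch. The selection rule is an assumption, not the paper's method: it treats initial states whose estimated success probability is closest to a target (0.5 here, meaning neither trivial nor hopeless) as the most informative. The function name `pick_reset_state`, the lookup-table discriminator, and the state labels are hypothetical.

```python
from typing import Callable, Hashable, Sequence


def pick_reset_state(
    candidates: Sequence[Hashable],
    success_prob: Callable[[Hashable], float],
    target: float = 0.5,
) -> Hashable:
    """Return the candidate whose estimated success probability is nearest to target."""
    if not candidates:
        raise ValueError("need at least one candidate initial state")
    return min(candidates, key=lambda s: abs(success_prob(s) - target))


# Hypothetical discriminator outputs for three maze cells; in the paper these
# estimates would come from a learned model, not a lookup table.
estimates = {"next_to_goal": 0.95, "mid_corridor": 0.55, "far_corner": 0.05}

chosen = pick_reset_state(list(estimates), estimates.__getitem__)
print(chosen)  # mid_corridor
```

As the forward policy improves, the estimates for harder states would rise and the selected reset state would move further from the goal, which is the adaptive behavior the abstract describes.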

Occlusion-aware Risk Assessment and Driving Strategy for Autonomous Vehicles Using Simplified Reachability Quantification

Jul 02, 2023

Hyunwoo Park, Jongseo Choi, Hyuntai Chin, Sang-Hyun Lee, Doosan Baek

Abstract: There are several unresolved challenges for autonomous vehicles. One of them is safely navigating among occluded pedestrians and vehicles. Much previous work has tried to solve this problem by generating phantom cars and assessing their risk. Motivated by this work, we propose an algorithm that efficiently assesses the risks of phantom pedestrians and vehicles using Simplified Reachability Quantification. We use this occlusion risk to set a speed limit at the risky position when planning the velocity profile of an autonomous vehicle. This allows an autonomous vehicle to drive safely and efficiently in occluded areas. The proposed algorithm was evaluated in various scenarios in the CARLA simulator. It reduced the average collision rate by 6.14X and the discomfort score by 5.03X while increasing traversal time by 1.48X compared to baseline 1, and reduced computation time by 20.15X compared to baseline 2.

* 9 pages, 9 figures
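
The step of setting a speed limit at a risky position when planning the velocity profile can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's planner. The speed limits are taken as given, since how the occlusion risk is converted into a limit is not described here. The backward pass that enforces a maximum deceleration is a standard kinematic constraint (v_i^2 <= v_{i+1}^2 + 2 * a * ds) added for illustration. The name `apply_speed_limits` and all numbers are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def apply_speed_limits(
    speeds: List[float],
    limits: Dict[int, float],
    spacing: float,
    max_decel: float,
) -> List[float]:
    """Cap a velocity profile at risky waypoints and make it reachable by braking.

    speeds:    planned speed (m/s) at evenly spaced waypoints
    limits:    waypoint index -> speed limit (m/s) at that risky position
    spacing:   distance between consecutive waypoints (m)
    max_decel: largest allowed deceleration (m/s^2), a positive number
    """
    # Step 1: clamp the speed at every risky waypoint.
    capped = [min(v, limits.get(i, float("inf"))) for i, v in enumerate(speeds)]
    # Step 2: sweep backward so each waypoint can still brake down to the next one.
    for i in range(len(capped) - 2, -1, -1):
        reachable = sqrt(capped[i + 1] ** 2 + 2.0 * max_decel * spacing)
        capped[i] = min(capped[i], reachable)
    return capped


# Cruise at 10 m/s, with an occluded crossing at waypoint 2 limited to 3 m/s.
profile = apply_speed_limits(
    speeds=[10.0, 10.0, 10.0, 10.0],
    limits={2: 3.0},
    spacing=4.0,
    max_decel=2.0,
)
print(profile)
```

The vehicle slows before the risky waypoint and returns to cruise speed after it, which matches the reported trade-off of lower collision rate against longer traversal time.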
