Motion planners are essential for the safe operation of automated vehicles across various scenarios. However, no motion planning algorithm has achieved perfection in the literature, and improving its performance is often time-consuming and labor-intensive. To tackle the aforementioned issues, we present DrPlanner, the first framework designed to automatically diagnose and repair motion planners using large language models. Initially, we generate a structured description of the planner and its planned trajectories from both natural and programming languages. Leveraging the profound capabilities of large language models in addressing reasoning challenges, our framework returns repaired planners with detailed diagnostic descriptions. Furthermore, the framework advances iteratively with continuous feedback from the evaluation of the repaired outcomes. Our approach is validated using search-based motion planners; experimental results highlight the need of demonstrations in the prompt and the ability of our framework in identifying and rectifying elusive issues effectively.
Autonomous vehicles have to obey traffic rules. These rules are often formalized using temporal logic, resulting in constraints that are hard to solve using optimization-based motion planners. Reinforcement Learning (RL) is a promising method to find motion plans adhering to temporal logic specifications. However, vanilla RL algorithms are based on random exploration, which is inherently unsafe. To address this issue, we propose a provably safe RL approach that always complies with traffic rules. As a specific application area, we consider vessels on the open sea, which must adhere to the Convention on the International Regulations for Preventing Collisions at Sea (COLREGS). We introduce an efficient verification approach that determines the compliance of actions with respect to the COLREGS formalized using temporal logic. Our action verification is integrated into the RL process so that the agent only selects verified actions. In contrast to agents that only integrate the traffic rule information in the reward function, our provably safe agent always complies with the formalized rules in critical maritime traffic situations and, thus, never causes a collision.
Neural networks are vulnerable to adversarial attacks, i.e., small input perturbations can result in substantially different outputs of a neural network. Safety-critical environments require neural networks that are robust against input perturbations. However, training and formally verifying robust neural networks is challenging. We address this challenge by employing, for the first time, a end-to-end set-based training procedure that trains robust neural networks for formal verification. Our training procedure drastically simplifies the subsequent formal robustness verification of the trained neural network. While previous research has predominantly focused on augmenting neural network training with adversarial attacks, our approach leverages set-based computing to train neural networks with entire sets of perturbed inputs. Moreover, we demonstrate that our set-based training procedure effectively trains robust neural networks, which are easier to verify. In many cases, set-based trained neural networks outperform neural networks trained with state-of-the-art adversarial attacks.
Deep reinforcement learning (RL) has shown promising results in robot motion planning with first attempts in human-robot collaboration (HRC). However, a fair comparison of RL approaches in HRC under the constraint of guaranteed safety is yet to be made. We, therefore, present human-robot gym, a benchmark for safe RL in HRC. Our benchmark provides eight challenging, realistic HRC tasks in a modular simulation framework. Most importantly, human-robot gym includes a safety shield that provably guarantees human safety. We are, thereby, the first to provide a benchmark to train RL agents that adhere to the safety specifications of real-world HRC. This bridges a critical gap between theoretic RL research and its real-world deployment. Our evaluation of six environments led to three key results: (a) the diverse nature of the tasks offered by human-robot gym creates a challenging benchmark for state-of-the-art RL methods, (b) incorporating expert knowledge in the RL training in the form of an action-based reward can outperform the expert, and (c) our agents negligibly overfit to training data.
Reachability analysis is a powerful tool for computing the set of states or outputs reachable for a system. While previous work has focused on systems described by state-space models, we present the first methods to compute reachable sets of ARMAX models - one of the most common input-output models originating from data-driven system identification. The first approach we propose can only be used with dependency-preserving set representations such as symbolic zonotopes, while the second one is valid for arbitrary set representations but relies on a reformulation of the ARMAX model. By analyzing the computational complexities, we show that both approaches scale quadratically with respect to the time horizon of the reachability problem when using symbolic zonotopes. To reduce the computational complexity, we propose a third approach that scales linearly with respect to the time horizon when using set representations that are closed under Minkowski addition and linear transformation and that satisfy that the computational complexity of the Minkowski sum is independent of the representation size of the operands. Our numerical experiments demonstrate that the reachable sets of ARMAX models are tighter than the reachable sets of equivalent state space models in case of unknown initial states. Therefore, this methodology has the potential to significantly reduce the conservatism of various verification techniques.
While current research and development of autonomous driving primarily focuses on developing new features and algorithms, the transfer from isolated software components into an entire software stack has been covered sparsely. Besides that, due to the complexity of autonomous software stacks and public road traffic, the optimal validation of entire stacks is an open research problem. Our paper targets these two aspects. We present our autonomous research vehicle EDGAR and its digital twin, a detailed virtual duplication of the vehicle. While the vehicle's setup is closely related to the state of the art, its virtual duplication is a valuable contribution as it is crucial for a consistent validation process from simulation to real-world tests. In addition, different development teams can work with the same model, making integration and testing of the software stacks much easier, significantly accelerating the development process. The real and virtual vehicles are embedded in a comprehensive development environment, which is also introduced. All parameters of the digital twin are provided open-source at https://github.com/TUMFTM/edgar_digital_twin.
Industrial robots are designed as general-purpose hardware, which limits their ability to adapt to changing task requirements or environments. Modular robots, on the other hand, offer flexibility and can be easily customized to suit diverse needs. The morphology, i.e., the form and structure of a robot, significantly impacts the primary performance metrics acquisition cost, cycle time, and energy efficiency. However, identifying an optimal module composition for a specific task remains an open problem, presenting a substantial hurdle in developing task-tailored modular robots. Previous approaches either lack adequate exploration of the design space or the possibility to adapt to complex tasks. We propose combining a genetic algorithm with a lexicographic evaluation of solution candidates to overcome this problem and navigate search spaces exceeding those in prior work by magnitudes in the number of possible compositions. We demonstrate that our approach outperforms a state-of-the-art baseline and is able to synthesize modular robots for industrial tasks in cluttered environments.
Verifying the correct behavior of robots in contact tasks is challenging due to model uncertainties associated with contacts. Standard methods for testing often fall short since all (uncountable many) solutions cannot be obtained. Instead, we propose to formally and efficiently verify robot behaviors in contact tasks using reachability analysis, which enables checking all the reachable states against user-provided specifications. To this end, we extend the state of the art in reachability analysis for hybrid (mixed discrete and continuous) dynamics subject to discrete-time input trajectories. In particular, we present a novel and scalable guard intersection approach to reliably compute the complex behavior caused by contacts. We model robots subject to contacts as hybrid automata in which crucial time delays are included. The usefulness of our approach is demonstrated by verifying safe human-robot interaction in the presence of constrained collisions, which was out of reach for existing methods.
Scenarios are a crucial element for developing, testing, and verifying autonomous driving systems. However, open-source scenarios are often formulated using different terminologies. This limits their usage across different applications as many scenario representation formats are not directly compatible with each other. To address this problem, we present the first open-source converter from the OpenSCENARIO format to the CommonRoad format, which are two of the most popular scenario formats used in autonomous driving. Our converter employs a simulation tool to execute the dynamic elements defined by OpenSCENARIO. The converter is available at commonroad.in.tum.de and we demonstrate its usefulness by converting publicly available scenarios in the OpenSCENARIO format and evaluating them using CommonRoad tools.