Abstract: Reinforcement Learning (RL) has enabled vast performance improvements for robotics systems. To achieve these results, however, the agent often must randomly explore the environment, which presents a significant challenge for safety-critical systems. Barrier functions can address this challenge by enabling an override that approximates the RL control input as closely as possible without violating a safety constraint. Unfortunately, this override can be computationally intractable when the dynamics are not convex in the control input or when time is discrete, as is often the case when training RL systems. We therefore consider these cases, developing novel barrier functions for two nonconvex systems (fixed-wing aircraft and self-driving cars performing lane merging with adaptive cruise control) in discrete time. Although solving online for an optimal override is in general intractable when the dynamics are nonconvex in the control input, we investigate approximate solutions and find that they enable performance commensurate with baseline RL methods with zero safety violations. In particular, even without attempting to solve for the optimal override at all, performance remains competitive with the baseline. We discuss the tradeoffs of the approximate override solutions, including performance and computational tractability.
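As an illustration of what an approximate override can look like, the following is a minimal sketch and not the method of the paper. It assumes a toy constant-speed unicycle whose only control is the turn rate, a circular obstacle, a hypothetical function `h` that is nonnegative outside the obstacle, and the common discrete-time condition h(x_next) >= (1 - gamma) * h(x). All names and parameter values (`step`, `h`, `safe_override`, `gamma`, `omega_max`) are assumptions made for this example. The toy `h` is not a verified barrier function, so a safe candidate is not guaranteed to exist.

```python
import numpy as np

def step(state, omega, v=1.0, dt=0.1):
    """One discrete step of a constant-speed unicycle (nonconvex in omega)."""
    x, y, theta = state
    theta_next = theta + omega * dt
    return np.array([x + v * np.cos(theta_next) * dt,
                     y + v * np.sin(theta_next) * dt,
                     theta_next])

def h(state, obstacle=(0.0, 0.0), radius=1.0):
    """Toy safety function: squared distance to the obstacle minus radius^2."""
    dx, dy = state[0] - obstacle[0], state[1] - obstacle[1]
    return dx * dx + dy * dy - radius ** 2

def safe_override(state, u_rl, gamma=0.2, omega_max=2.0, n_candidates=41):
    """Return u_rl if it passes the check, else the nearest sampled safe control."""
    threshold = (1.0 - gamma) * h(state)
    if h(step(state, u_rl)) >= threshold:
        return u_rl  # the RL action already satisfies the condition
    # Approximate the intractable optimization with a grid over the control range.
    candidates = np.linspace(-omega_max, omega_max, n_candidates)
    h_next = np.array([h(step(state, u)) for u in candidates])
    feasible = h_next >= threshold
    if feasible.any():
        safe = candidates[feasible]
        return float(safe[np.argmin(np.abs(safe - u_rl))])
    # No sampled control passes: fall back to the least unsafe candidate.
    return float(candidates[np.argmax(h_next)])
```

For example, `safe_override(np.array([-1.5, 0.0, 0.4]), 0.0)` rejects the straight-ahead command and returns the smallest sampled turn rate that satisfies the condition. The grid resolution trades override quality against per-step computation.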
Abstract: This paper demonstrates that in some cases the safety override arising from the use of a barrier function can be needlessly restrictive. In particular, we examine the case of fixed-wing collision avoidance and show that, when using a barrier function, there are cases where two fixed-wing aircraft can come closer to colliding than if there were no barrier function at all. In addition, we construct cases where the barrier function labels the system as unsafe even when the vehicles start arbitrarily far apart. In other words, the barrier function ensures safety but at an unnecessary cost to performance. We therefore introduce model-free barrier functions, which take a data-driven approach to creating a barrier function. We demonstrate the effectiveness of model-free barrier functions in a collision avoidance simulation of two fixed-wing aircraft.
Abstract: In this paper, we discuss how to construct a barrier certificate for a control-affine system subject to actuator constraints and motivate this discussion by examining collision avoidance for fixed-wing unmanned aerial vehicles (UAVs). In particular, the theoretical development in this paper is used to create a barrier certificate that ensures that two UAVs will not collide for all future times, assuming the vehicles start in a safe configuration. We then extend this development by discussing how to ensure that multiple safety constraints are simultaneously satisfied in a decentralized manner (e.g., ensuring that distances are above some threshold for all pairwise combinations of UAVs for all future times) while keeping output actuator commands within specified limits. We validate the theoretical developments of this paper in the simulator SCRIMMAGE with a simulation of 20 UAVs that maintain safe distances from each other even though their nominal paths would otherwise cause a collision.
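To make the interaction between a barrier constraint and actuator limits concrete, the following is a minimal sketch under stated assumptions and not the construction of the paper. It assumes a scalar control input, a linear class-K term `alpha * h`, and the commonly used control-affine condition L_f h + L_g h * u + alpha * h >= 0. The function name `filter_control` and all of its parameters are hypothetical. It returns the admissible control closest to the nominal one, or `None` when the barrier constraint and the actuator limits cannot both be met.

```python
def filter_control(u_nom, lf_h, lg_h, h_val, u_min, u_max, alpha=1.0):
    """Closest u to u_nom with lf_h + lg_h*u + alpha*h_val >= 0 and u_min <= u <= u_max.

    Scalar-input illustration only; returns None if no such u exists.
    """
    lo, hi = u_min, u_max
    if lg_h > 0:
        # Constraint is a lower bound on u.
        lo = max(lo, -(lf_h + alpha * h_val) / lg_h)
    elif lg_h < 0:
        # Dividing by a negative number flips the inequality: upper bound on u.
        hi = min(hi, -(lf_h + alpha * h_val) / lg_h)
    elif lf_h + alpha * h_val < 0:
        # The control does not enter the constraint, and the constraint is violated.
        return None
    if lo > hi:
        # The actuator limits leave no control that satisfies the constraint.
        return None
    return min(max(u_nom, lo), hi)
```

The `None` branch is the case that motivates accounting for actuator limits when the certificate is designed: a constraint that ignores the limits can demand a command the vehicle cannot produce.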