Abstract: Imitation learning methods have demonstrated considerable success in teaching autonomous systems complex tasks through expert demonstrations. However, a limitation of these methods is their lack of interpretability, particularly in understanding the specific task the learning agent aims to accomplish. In this paper, we propose a novel imitation learning method that combines Signal Temporal Logic (STL) inference and control synthesis, enabling the explicit representation of the task as an STL formula. This approach not only provides a clear understanding of the task but also allows for the incorporation of human knowledge and adaptation to new scenarios through manual adjustments of the STL formulae. Additionally, we employ a Generative Adversarial Network (GAN)-inspired training approach for both the inference and the control policy, effectively narrowing the gap between the expert and learned policies. The effectiveness of our algorithm is demonstrated through two case studies, showcasing its practical applicability and adaptability.
Abstract: We present a Deep Reinforcement Learning (DRL) algorithm for a task-guided robot with unknown continuous-time dynamics deployed in a large-scale complex environment. Linear Temporal Logic (LTL) is applied to express a rich robotic specification. To overcome the environmental challenge, we propose a novel path planning-guided reward scheme that is dense over the state space, and crucially, robust to infeasibility of computed geometric paths due to the unknown robot dynamics. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-tasks that are solved using distributed DRL, where the sub-tasks are trained in parallel, using Deep Policy Gradient algorithms. Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale complex environments.
Abstract: Real-time and human-interpretable decision-making in cyber-physical systems is a significant but challenging task, which usually requires predictions of possible future events from limited data. In this paper, we introduce a time-incremental learning framework: given a dataset of labeled signal traces with a common time horizon, we propose a method to predict the label of a signal that is received incrementally over time, referred to as a prefix signal. Prefix signals are signals that are observed as they are generated, and their length is shorter than the common time horizon of the signals. We present a novel decision-tree-based approach to generate a finite number of Signal Temporal Logic (STL) specifications from the given dataset, and construct a predictor based on them. Each STL specification, as a binary classifier of time-series data, captures the temporal properties of the dataset over time. The predictor is constructed by assigning time-variant weights to the STL formulas. The weights are learned by using neural networks, with the goal of minimizing the misclassification rate for the prefix signals defined over the given dataset. The learned predictor is used to predict the label of a prefix signal by computing the weighted sum of the robustness of the prefix signal with respect to each STL formula. The effectiveness and classification performance of our algorithm are evaluated on urban-driving and naval-surveillance case studies.
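The weighted-robustness prediction described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the helpers `always_gt`, `eventually_gt`, `predict`, and `weights_at` are hypothetical, signals are one-dimensional and sampled at integer time steps, temporal windows are truncated to the samples observed so far, and the time-variant weights come from a hand-written lookup rather than a trained neural network.

```python
from typing import Callable, List, Sequence

# A formula is represented here by its robustness function over a prefix signal.
Robustness = Callable[[Sequence[float]], float]


def always_gt(c: float, a: int, b: int) -> Robustness:
    """Robustness of G_[a,b](x > c), using only the samples observed so far."""
    def rho(x: Sequence[float]) -> float:
        window = x[a:b + 1]
        # Assumption: with no observed sample in the window, return 0 (no evidence).
        return min(v - c for v in window) if len(window) else 0.0
    return rho


def eventually_gt(c: float, a: int, b: int) -> Robustness:
    """Robustness of F_[a,b](x > c), using only the samples observed so far."""
    def rho(x: Sequence[float]) -> float:
        window = x[a:b + 1]
        return max(v - c for v in window) if len(window) else 0.0
    return rho


def predict(prefix: Sequence[float],
            formulas: List[Robustness],
            weights_at: Callable[[int], List[float]]) -> int:
    """Label a prefix by the sign of the weighted sum of robustness values."""
    weights = weights_at(len(prefix))  # weights depend on the prefix length
    score = sum(w * rho(prefix) for w, rho in zip(weights, formulas))
    return 1 if score >= 0 else -1


def weights_at(t: int) -> List[float]:
    """Hypothetical stand-in for the learned, time-variant weights."""
    return [1.0, 0.0] if t < 3 else [0.5, 0.5]


formulas = [always_gt(0.0, 0, 3), eventually_gt(2.0, 2, 5)]
early_label = predict([1.0, 0.5], formulas, weights_at)
later_label = predict([1.0, 0.5, -3.0, 0.2], formulas, weights_at)
```

In this toy setting the label can change as more of the signal is revealed, which is the behavior a time-incremental predictor is meant to capture.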
Abstract: Learning properties of dynamical systems from data provides important insights that help us understand such systems and mitigate undesired outcomes. In this work, we propose a framework for learning spatio-temporal (ST) properties as formal logic specifications from data. We introduce SVM-STL, an extension of Signal Temporal Logic (STL), capable of specifying spatial and temporal properties of a wide range of dynamical systems that exhibit time-varying spatial patterns. Our framework utilizes machine learning techniques to learn SVM-STL specifications from system executions given by sequences of spatial patterns. We present methods to deal with both labeled and unlabeled data. In addition, given system requirements in the form of SVM-STL specifications, we provide an approach for parameter synthesis to find parameters that maximize the satisfaction of such specifications. Our learning framework and parameter synthesis approach are showcased in an example of a reaction-diffusion system.
Abstract: Time-series data classification is central to the analysis and control of autonomous systems, such as robots and self-driving cars. Temporal logic-based learning algorithms have been proposed recently as classifiers of such data. However, current frameworks are either inaccurate for real-world applications, such as autonomous driving, or they generate long and complicated formulae that lack interpretability. To address these limitations, we introduce a novel learning method, called Boosted Concise Decision Trees (BCDTs), to generate binary classifiers that are represented as Signal Temporal Logic (STL) formulae. Our algorithm leverages an ensemble of Concise Decision Trees (CDTs) to improve the classification performance, where each CDT is a decision tree that is empowered by a set of techniques to generate simpler formulae and improve interpretability. The effectiveness and classification performance of our algorithm are evaluated on naval surveillance and urban-driving case studies.
Abstract: Many autonomous systems, such as robots and self-driving cars, involve real-time decision making in complex environments, and require prediction of future outcomes from limited data. Moreover, their decisions are increasingly required to be interpretable to humans for safe and trustworthy co-existence. This paper is a first step towards interpretable learning-based robot control. We introduce a novel learning problem, called incremental formula and predictor learning, to generate binary classifiers with temporal logic structure from time-series data. The classifiers are represented as pairs of Signal Temporal Logic (STL) formulae and predictors for their satisfaction. The incremental property provides prediction of labels for prefix signals that are revealed over time. We propose a boosted decision-tree algorithm that leverages weak, but computationally inexpensive, learners to increase prediction and runtime performance. The effectiveness and classification accuracy of our algorithms are evaluated on autonomous-driving and naval-surveillance case studies.
Abstract: This paper presents a novel two-level control architecture for a fully autonomous vehicle in a deterministic environment, which can handle traffic rules as specifications and low-level vehicle control with real-time performance. At the top level, we use a simple representation of the environment and vehicle dynamics to formulate a linear Model Predictive Control (MPC) problem. We describe the traffic rules and safety constraints using Signal Temporal Logic (STL) formulas, which are mapped to mixed-integer linear constraints in the optimization problem. The solution obtained at the top level is used at the bottom level to determine the best control command for satisfying the constraints in a more detailed framework. At the bottom level, specification-based runtime monitoring techniques, together with detailed representations of the environment and vehicle dynamics, are used to compensate for the mismatch between the simple models used in the MPC and the real complex models. We obtain substantial improvements over existing approaches in the literature in terms of runtime performance, and we validate the effectiveness of our proposed control approach in the CARLA simulator.
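The mapping from STL formulas to mixed-integer linear constraints mentioned in this abstract is commonly done with a big-M encoding. The following is a sketch of that standard construction, not necessarily the exact encoding used in the paper; the symbols $M$ (a sufficiently large constant), $\epsilon$ (a small positive margin), and the binary variables $z$ are introduced here for illustration, and the predicate function $\mu$ is assumed affine in the state so that all constraints remain linear.

```latex
% Predicate \mu(x_t) \ge 0, with binary z_t^{\mu} = 1 iff the predicate holds at time t:
\mu(x_t) \ge -M\,(1 - z_t^{\mu}), \qquad \mu(x_t) \le M\, z_t^{\mu} - \epsilon .

% Conjunction z = z_1 \wedge \dots \wedge z_n:
z \le z_i \quad (i = 1,\dots,n), \qquad z \ge 1 - n + \sum_{i=1}^{n} z_i .

% Disjunction z = z_1 \vee \dots \vee z_n:
z \ge z_i \quad (i = 1,\dots,n), \qquad z \le \sum_{i=1}^{n} z_i .

% Temporal operators over a discrete horizon reduce to the two cases above:
z_t^{G_{[a,b]}\varphi} = \bigwedge_{t' = t+a}^{t+b} z_{t'}^{\varphi}, \qquad
z_t^{F_{[a,b]}\varphi} = \bigvee_{t' = t+a}^{t+b} z_{t'}^{\varphi} .
```

Requiring the binary variable of the top-level formula to equal 1 at the initial time step then enforces satisfaction of the specification inside the MPC problem.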
Abstract: In this paper, we are concerned with the design, through Linear Programming, of a set of controllers on a cell decomposition of a polygonal environment. The core of our proposed method consists of a convex min-max formulation that synthesizes an output-feedback controller, based on relative displacement measurements with respect to a set of landmarks. The optimization problem is formulated using piecewise linear Control Lyapunov Function and Control Barrier Function constraints, to provide guarantees of stability and safety. The inner maximization problem ensures that these constraints are met by all the points in each cell, while the outer minimization problem balances the different constraints to optimize robustness. We convert this min-max optimization problem to a regular Linear Programming problem by forming the dual of the inner maximization problem. Although in principle our approach is applicable to any system with piecewise linear dynamics, in this paper, as a proof of concept, we apply it to first- and second-order integrators. We show through simulations that the resulting controllers are robust to significant deformations of the environment.
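The dualization step described in this abstract can be sketched generically as follows. The notation is assumed for illustration and does not come from the paper: a cell is written as the nonempty, bounded polytope $\{p : A p \le h\}$, the controller parameters are collected in $K$, and a single Lyapunov- or barrier-type constraint is assumed to take the form $a(K)^\top p + b(K) \le 0$ with $a$ and $b$ affine in $K$.

```latex
% Constraint that must hold at every point of the cell:
\max_{p \,:\, A p \le h} \; a(K)^\top p + b(K) \;\le\; 0 .

% By strong LP duality, the inner maximum equals
\max_{p \,:\, A p \le h} a(K)^\top p
  \;=\; \min_{\lambda \ge 0,\; A^\top \lambda = a(K)} h^\top \lambda .

% Hence the semi-infinite constraint is equivalent to the existence of a multiplier \lambda with
\lambda \ge 0, \qquad A^\top \lambda = a(K), \qquad h^\top \lambda + b(K) \le 0 .
```

Because these conditions are jointly linear in $(K, \lambda)$, the multiplier can be added as a decision variable to the outer minimization, which leaves a single Linear Program.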