Reinforcement Learning (RL) algorithms have achieved remarkable performance in decision making and control tasks due to their ability to reason about long-term, cumulative reward using trial and error. However, during RL training, applying this trial-and-error approach to real-world robots operating in safety-critical environments may lead to collisions. To address this challenge, this paper proposes a Reachability-based Trajectory Safeguard (RTS), which leverages trajectory parameterization and reachability analysis to ensure safety while a policy is being learned. This method ensures that a robot with a continuous action space can be trained from scratch safely in real time. Importantly, this safety layer can still be applied after a policy has been learned. The efficacy of this method is illustrated on three nonlinear robot models, including a 12-D quadrotor drone, in simulation. By ensuring safety with RTS, this paper demonstrates that the proposed algorithm is not only safe, but can also achieve a higher reward in a considerably shorter training time than a non-safe counterpart.
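As a toy illustration of the safeguard idea, the sketch below keeps a policy's proposed trajectory parameter only if its over-approximated reachable set avoids every obstacle, and otherwise substitutes the nearest verified-safe candidate. All specifics here are hypothetical assumptions for illustration: a 1-D trajectory parameter, disc-shaped reachable-set over-approximations, and disc obstacles; the actual RTS uses precomputed reachable sets of the robot's parameterized trajectories.

```python
import numpy as np

def is_safe(param, obstacles, reach_radius=0.5):
    # Hypothetical reachable-set check: over-approximate the swept volume of
    # the parameterized trajectory by a disc of radius `reach_radius` around
    # its endpoint, and require clearance from every obstacle disc.
    endpoint = np.array([np.cos(param), np.sin(param)])
    for (ox, oy, orad) in obstacles:
        if np.hypot(endpoint[0] - ox, endpoint[1] - oy) <= reach_radius + orad:
            return False
    return True

def safeguard(proposed_param, candidate_params, obstacles):
    # Keep the policy's proposal if it is verified safe; otherwise replace it
    # with the nearest candidate parameter whose reachable set is obstacle-free.
    if is_safe(proposed_param, obstacles):
        return proposed_param
    safe = [p for p in candidate_params if is_safe(p, obstacles)]
    if not safe:
        return None  # no safe plan found: fall back to a (verified) braking maneuver
    return min(safe, key=lambda p: abs(p - proposed_param))
```

Because the filter only replaces unsafe proposals, the policy still receives reward feedback on its own actions, which is what allows training to proceed safely from scratch.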
Nonlinear dynamical systems can be made easier to control by lifting them into the space of observable functions, where their evolution is described by the linear Koopman operator. This paper describes how the Koopman operator can be used to generate approximate linear, bilinear, and nonlinear model realizations from data, and argues in favor of bilinear realizations for characterizing systems with unknown dynamics. Necessary and sufficient conditions for a dynamical system to have a valid linear or bilinear realization over a given set of observable functions are presented and used to show that every control-affine system admits an infinite-dimensional bilinear realization, but does not necessarily admit a linear one. Therefore, approximate bilinear realizations constructed from generic sets of basis functions tend to improve as the number of basis functions increases, whereas approximate linear realizations may not. To demonstrate the advantages of bilinear Koopman realizations for control, linear, bilinear, and nonlinear Koopman model realizations of a simulated robot arm are constructed from data. In a trajectory following task, the bilinear realization exceeds the prediction accuracy of the linear realization and the computational efficiency of the nonlinear realization when incorporated into a model predictive control framework.
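To make the lifting idea concrete, the following sketch fits an approximate linear Koopman realization by least squares on snapshot data (an EDMD-style regression). The system, observables, and constants are illustrative assumptions, not the paper's robot-arm models: this toy autonomous system is chosen because it admits an exact finite-dimensional linear lift, so the regression recovers the Koopman matrix exactly.

```python
import numpy as np

# Toy system with a known finite-dimensional Koopman lift (assumed example):
#   x1' = a*x1,  x2' = b*x2 + (a^2 - b)*x1^2.
# With observables z = [x1, x2, x1^2], the dynamics become exactly linear: z' = K z.
a, b = 0.9, 0.5

def step(x):
    return np.array([a * x[0], b * x[1] + (a**2 - b) * x[0]**2])

def lift(x):
    return np.array([x[0], x[1], x[0]**2])

# Collect snapshot pairs and fit K by least squares.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
Z = np.stack([lift(x) for x in X])           # lifted states
Zp = np.stack([lift(step(x)) for x in X])    # lifted successors
K = np.linalg.lstsq(Z, Zp, rcond=None)[0].T  # z' ~= K z

# The linear model reproduces the nonlinear flow through the lift.
x = np.array([0.7, -0.3])
z = lift(x)
for _ in range(5):
    x = step(x)
    z = K @ z
```

For a system with inputs, a bilinear realization would additionally regress terms of the form (observable x input), which is the structure the paper argues every control-affine system supports in the infinite-dimensional limit.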
People with lower-limb loss, the majority of whom use passive prostheses, exhibit a high incidence of falls each year. Powered lower-limb prostheses have the potential to reduce fall rates by actively helping the user recover from a stumble, but the unpredictability of the human response makes it difficult to design controllers that ensure a successful recovery. This paper presents a method called TRIP-RTD (Trip Recovery in Prostheses via Reachability-based Trajectory Design) for online trajectory planning in a knee prosthesis during and after a stumble that can accommodate a set of possible predictions of human behavior. Using this predicted set of human behavior, the proposed method computes a parameterized reachable set of trajectories for the human-prosthesis system. To ensure safety at run-time, TRIP-RTD selects a trajectory for the prosthesis that guarantees that all possible states of the human-prosthesis system at touchdown arrive in the basin of attraction of the nominal behavior of the system. In simulated stumble experiments where a nominal phase-based controller was unable to help the system recover, TRIP-RTD produced trajectories in under 101 ms that led to successful recoveries in every case where a feasible solution was found.
Pedestrian trajectory prediction is an essential task in robotic applications such as autonomous driving and robot navigation. State-of-the-art trajectory predictors use a conditional variational autoencoder (CVAE) with recurrent neural networks (RNNs) to encode observed trajectories and decode multi-modal future trajectories. This process can suffer from accumulated errors over long prediction horizons (>=2 seconds). This paper presents BiTraP, a goal-conditioned bi-directional multi-modal trajectory prediction method based on the CVAE. BiTraP estimates the goal (end-point) of trajectories and introduces a novel bi-directional decoder to improve longer-term trajectory prediction accuracy. Extensive experiments show that BiTraP generalizes to both first-person view (FPV) and bird's-eye view (BEV) scenarios and outperforms state-of-the-art results by ~10-50%. We also show that different choices of non-parametric versus parametric target models in the CVAE directly influence the predicted multi-modal trajectory distributions. These results provide guidance on trajectory predictor design for robotic applications such as collision avoidance and navigation systems.
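The bi-directional decoder wiring can be sketched structurally as below. This is only a shape-level illustration with random placeholder weights: the actual BiTraP uses trained GRU decoders conditioned on a CVAE latent sample, whereas here the forward pass from the current position, the backward pass from the estimated goal, and the goal-anchored blend are shown with plain matrix recurrences.

```python
import numpy as np

# Structural sketch of goal-conditioned bi-directional decoding (assumed
# shapes; weights are random placeholders, not trained parameters).
rng = np.random.default_rng(0)
H, T = 8, 12  # hidden size, prediction horizon (timesteps)
Wf, Wb, Wo = (rng.normal(0, 0.1, (H, H)) for _ in range(3))

def decode(x0, goal, z):
    # Forward pass: roll a hidden state out from the current position x0
    # and the latent sample z.
    h = np.tanh(np.concatenate([x0, z, np.zeros(H - len(x0) - len(z))]))
    fwd = []
    for _ in range(T):
        h = np.tanh(Wf @ h)
        fwd.append(h)
    # Backward pass: roll a hidden state from the estimated goal toward the
    # present, then reverse it so both passes align in time.
    g = np.tanh(np.concatenate([goal, np.zeros(H - len(goal))]))
    bwd = []
    for _ in range(T):
        g = np.tanh(Wb @ g)
        bwd.append(g)
    bwd = bwd[::-1]
    # Blend both passes into 2-D waypoints; anchor the final step at the goal.
    traj = [(Wo @ (f + b))[:2] for f, b in zip(fwd, bwd)]
    traj[-1] = goal
    return np.array(traj)
```

The point of the backward pass is that every predicted waypoint receives information from the goal estimate, which is what counters error accumulation over longer horizons.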
The continual improvement of 3D sensors has driven the development of algorithms to perform point cloud analysis. In fact, techniques for point cloud classification and segmentation have in recent years achieved remarkable performance, driven in part by leveraging large synthetic datasets. Unfortunately, these same state-of-the-art approaches perform poorly when applied to incomplete point clouds. This limitation of existing algorithms is particularly concerning, since point clouds generated by 3D sensors in the real world are usually incomplete due to perspective view or occlusion by other objects. This paper proposes a general model for partial point cloud analysis wherein the latent feature encoding a complete point cloud is inferred by applying a local point set voting strategy. In particular, each local point set constructs a vote that corresponds to a distribution in the latent space, and the optimal latent feature is the one with the highest probability. This approach ensures that any subsequent point cloud analysis is robust to partial observation while simultaneously guaranteeing that the proposed model is able to output multiple possible results. This paper illustrates that this proposed method achieves state-of-the-art performance on shape classification, part segmentation, and point cloud completion.
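Under one simple modeling assumption (each local point set's vote is a diagonal Gaussian over the latent space, and the votes are treated as independent likelihoods), the highest-probability latent feature has a closed form: the precision-weighted mean of the vote means. The sketch below illustrates this aggregation step; the actual model's vote distributions are learned, so this is an assumed simplification.

```python
import numpy as np

def aggregate_votes(means, variances):
    # Each row of `means`/`variances` is one local point set's diagonal
    # Gaussian vote in latent space. The product of the Gaussians is
    # maximized at the precision-weighted mean of the vote means.
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    return (precisions * means).sum(axis=0) / precisions.sum(axis=0)
```

Intuitively, local regions whose votes are confident (low variance) pull the aggregated latent feature toward themselves, while uncertain votes contribute little; sampling from the vote distributions instead of maximizing is what lets the model output multiple plausible completions.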
Uncooled microbolometers can enable robots to see in the absence of visible illumination by imaging the "heat" radiated from the scene. Despite this ability to see in the dark, these sensors suffer from significant motion blur, which has limited their application on robotic systems. As described in this paper, this motion blur arises due to the thermal inertia of each pixel. As a result, traditional motion deblurring techniques, which rely on identifying an appropriate spatial blur kernel to perform spatial deconvolution, are unable to reliably perform motion deblurring on thermal camera images. To address this problem, this paper formulates reversing the effect of thermal inertia at a single pixel as a Least Absolute Shrinkage and Selection Operator (LASSO) problem, which can be solved rapidly using a quadratic programming solver. By leveraging sparsity and a high frame rate, this pixel-wise LASSO formulation is able to recover motion-deblurred frames of thermal videos without using any spatial information. To compare its quality against state-of-the-art visible-camera-based deblurring methods, this paper evaluated the performance of a family of pre-trained object detectors on a set of images restored by different deblurring algorithms. All evaluated object detectors performed systematically better on images restored by the proposed algorithm than on images restored by any of the other tested state-of-the-art methods.
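A minimal sketch of the per-pixel idea follows, under two assumptions made here for illustration: the thermal-inertia response is modeled as a first-order exponential (so the measured time series is a lower-triangular linear map of the true signal), and the LASSO is solved with ISTA (proximal gradient) in place of the paper's quadratic-programming solver. The decay rate `r` and penalty `lam` are placeholder values.

```python
import numpy as np

# Assumed per-pixel model: the measured intensity y is the true signal x
# passed through an exponential thermal-inertia response, i.e. y = A x with
# A lower-triangular, A[i, j] = (1 - r) * r**(i - j) for j <= i. Deblurring
# one pixel's time series is then the LASSO problem
#     min_x 0.5 * ||A x - y||^2 + lam * ||x||_1,
# solved here by iterative soft-thresholding.
def deblur_pixel(y, r=0.8, lam=1e-3, iters=2000):
    n = len(y)
    i, j = np.indices((n, n))
    A = np.where(j <= i, (1 - r) * r ** (i - j), 0.0)
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth term
    x = np.zeros(n)
    for _ in range(iters):
        grad = A.T @ (A @ x - y)
        x = x - grad / L
        x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0.0)  # soft-threshold
    return x
```

Because the problem is one-dimensional per pixel, every pixel can be deblurred independently and in parallel, which is what makes the spatial-information-free formulation fast.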
Pedestrians and drivers interact closely in a wide range of environments. Autonomous vehicles (AVs) correspondingly face the need to predict pedestrians' future trajectories in these same environments. Traditional model-based prediction methods have been limited to making predictions in highly structured scenes with signalized intersections, marked crosswalks, or curbs. Deep learning methods have instead leveraged datasets to learn predictive features that generalize across scenes, at the cost of model interpretability. This paper aims to achieve both widely applicable and interpretable predictions by proposing a risk-based attention mechanism to learn when pedestrians yield, and a model of vehicle influence to learn how yielding affects motion. A novel probabilistic method, Off the Sidewalk Predictions (OSP), uses these to achieve accurate predictions in both shared spaces and traditional scenes. Experiments on urban datasets demonstrate that the real-time method achieves state-of-the-art performance.
Attack detection and mitigation strategies for cyber-physical systems (CPS) are an active area of research, and researchers have developed a variety of attack-detection tools such as dynamic watermarking. However, such methods often make assumptions that are difficult to guarantee, such as exact knowledge of the distribution of measurement noise. Here, we develop a new dynamic watermarking method that we call covariance-robust dynamic watermarking, which is able to handle uncertainties in the covariance of measurement noise. Specifically, we consider two cases: in the first, this covariance is fixed but unknown; in the second, it is slowly varying. For our tests, we only require knowledge of a set within which the covariance lies. Furthermore, we connect this problem to that of algorithmic fairness and the nascent field of fair hypothesis testing, and we show that our tests satisfy some notions of fairness. Finally, we exhibit the efficacy of our tests on empirical examples chosen to reflect values observed in a standard simulation model of autonomous vehicles.
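The basic dynamic-watermarking mechanism (without the covariance-robust extension) can be sketched on a scalar system: the controller injects a private random excitation into its input and checks that the measurements remain correlated with it. A replay attacker, who cannot observe the excitation, destroys this correlation. The system, noise levels, and correlation test statistic below are illustrative assumptions, not the paper's specific tests.

```python
import numpy as np

rng = np.random.default_rng(1)
a, n = 0.8, 5000  # assumed scalar dynamics x' = a*x + input, horizon n

def run(attacked):
    x, ys, es = 0.0, [], []
    replay = rng.normal(0, 1, n)  # pre-recorded measurements fed by the attacker
    for k in range(n):
        e = rng.normal()                       # private watermark excitation
        x = a * x + e + 0.1 * rng.normal()     # true state evolves with watermark
        y = replay[k] if attacked else x + 0.1 * rng.normal()
        ys.append(y)
        es.append(e)
    ys, es = np.array(ys), np.array(es)
    # Test statistic: correlation between the watermark and the next measurement.
    return np.corrcoef(es[:-1], ys[1:])[0, 1]
```

The covariance-robust variant addresses the case where the detection thresholds for such a statistic cannot be calibrated exactly because the measurement-noise covariance is only known to lie within a set.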
To move through the world, mobile robots typically use a receding-horizon strategy, wherein they execute an old plan while computing a new plan to incorporate new sensor information. A plan should be dynamically feasible, meaning it obeys constraints like the robot's dynamics and obstacle avoidance; it should have liveness, meaning the robot does not stop to plan so frequently that it cannot accomplish tasks; and it should be optimal, meaning that the robot tries to satisfy a user-specified cost function such as reaching a goal location as quickly as possible. Reachability-based Trajectory Design (RTD) is a planning method that can generate provably dynamically-feasible plans. However, RTD solves a nonlinear polynomial optimization program at each planning iteration, preventing optimality guarantees; furthermore, RTD can struggle with liveness because the robot must brake to a stop when the solver finds only local minima or cannot find a feasible solution. This paper proposes RTD*, which certifiably finds the globally optimal plan (if such a plan exists) at each planning iteration. This method is enabled by a novel Parallelized Constrained Bernstein Algorithm (PCBA), which is a branch-and-bound method for polynomial optimization. The contributions of this paper are: the implementation of PCBA; proofs of bounds on the time and memory usage of PCBA; a comparison of PCBA to state-of-the-art solvers; and the demonstration of PCBA/RTD* on a mobile robot. RTD* outperforms RTD in terms of optimality and liveness for real-time planning in a variety of environments with randomly-placed obstacles.
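A serial, univariate sketch of the Bernstein branch-and-bound idea follows (PCBA itself is parallelized, handles constraints, and operates on multivariate polynomials over boxes). It relies on three standard facts: the Bernstein coefficients of a polynomial enclose its range on an interval, the first and last coefficients are exact endpoint values, and de Casteljau subdivision tightens the enclosure on each half.

```python
from math import comb

def bernstein_coeffs(a):
    # Power-basis coefficients a[j] (of x^j) -> Bernstein coefficients on [0, 1].
    d = len(a) - 1
    return [sum(comb(i, j) / comb(d, j) * a[j] for j in range(i + 1))
            for i in range(d + 1)]

def subdivide(b):
    # de Casteljau split at t = 1/2: Bernstein coefficients on each half-interval.
    rows = [list(b)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([(p + q) / 2 for p, q in zip(prev, prev[1:])])
    left = [row[0] for row in rows]
    right = [row[-1] for row in reversed(rows)]
    return left, right

def minimize(a, tol=1e-6):
    # Branch-and-bound global minimization of the polynomial on [0, 1].
    boxes = [bernstein_coeffs(a)]
    best = min(boxes[0][0], boxes[0][-1])  # endpoint coefficients are exact values
    while boxes:
        b = boxes.pop()
        if min(b) > best - tol:  # min(b) lower-bounds the polynomial on this box
            continue             # prune: this box cannot improve on the incumbent
        best = min(best, b[0], b[-1])
        boxes.extend(subdivide(b))
    return best
```

Because each box's bound computation is independent, the subdivision queue parallelizes naturally, which is the property PCBA exploits; the pruning step is also what yields provable bounds on time and memory as a function of the tolerance.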