Human behavior prediction models enable robots to anticipate how humans may react to their actions, and hence are instrumental to devising safe and proactive robot planning algorithms. However, modeling complex interaction dynamics and capturing the possibility of many possible outcomes in such interactive settings is very challenging, which has recently prompted the study of several different approaches. In this work, we provide a self-contained tutorial on a conditional variational autoencoder (CVAE) approach to human behavior prediction which, at its core, can produce a multimodal probability distribution over future human trajectories conditioned on past interactions and candidate robot future actions. Specifically, the goals of this tutorial paper are to review and build a taxonomy of state-of-the-art methods in human behavior prediction, from physics-based to purely data-driven methods, provide a rigorous yet easily accessible description of a data-driven, CVAE-based approach, highlight important design characteristics that make this an attractive model to use in the context of model-based planning for human-robot interactions, and provide important design considerations when using this class of models.
Sampling-based motion planning techniques have emerged as an efficient algorithmic paradigm for solving complex motion planning problems. These approaches use a set of probing samples to construct an implicit graph representation of the robot's state space, allowing arbitrarily accurate representations as the number of samples increases to infinity. In practice, however, solution trajectories only rely on a few critical states, often defined by structure in the state space (e.g., doorways). In this work we propose a general method to identify these critical states via graph-theoretic techniques (betweenness centrality) and learn to predict criticality from only local environment features. These states are then leveraged more heavily via global connections within a hierarchical graph, termed Critical Probabilistic Roadmaps. Critical PRMs are demonstrated to achieve up to three orders of magnitude improvement over uniform sampling, while preserving the guarantees and complexity of sampling-based motion planning. A video is available at https://youtu.be/AYoD-pGd9ms.
RRT* is one of the most widely used sampling-based algorithms for asymptotically-optimal motion planning. This algorithm laid the foundations for optimality in motion planning as a whole, and inspired the development of numerous new algorithms in the field, many of which build upon RRT* itself. In this paper, we first identify a logical gap in the optimality proof of RRT*, which was developed in Karaman and Frazzoli (2011). Then, we present an alternative and mathematically-rigorous proof for asymptotic optimality. Our proof suggests that the connection radius used by RRT* should be increased from $\gamma \left(\frac{\log n}{n}\right)^{1/d}$ to $\gamma' \left(\frac{\log n}{n}\right)^{1/(d+1)}$ in order to account for the additional dimension of time that dictates the samples' ordering. Here $\gamma$, $\gamma'$, are constants, and $n$, $d$, are the number of samples and the dimension of the problem, respectively.
Action anticipation, intent prediction, and proactive behavior are all desirable characteristics for autonomous driving policies in interactive scenarios. Paramount, however, is ensuring safety on the road --- a key challenge in doing so is accounting for uncertainty in human driver actions without unduly impacting planner performance. This paper introduces a minimally-interventional safety controller operating within an autonomous vehicle control stack with the role of ensuring collision-free interaction with an externally controlled (e.g., human-driven) counterpart. We leverage reachability analysis to construct a real-time (100Hz) controller that serves the dual role of (1) tracking an input trajectory from a higher-level planning algorithm using model predictive control, and (2) assuring safety through maintaining the availability of a collision-free escape maneuver as a persistent constraint regardless of whatever future actions the other car takes. A full-scale steer-by-wire platform is used to conduct traffic weaving experiments wherein the two cars, initially side-by-side, must swap lanes in a limited amount of time and distance, emulating cars merging onto/off of a highway. We demonstrate that, with our control stack, the autonomous vehicle is able to avoid collision even when the other car defies the planner's expectations and takes dangerous actions, either carelessly or with the intent to collide, and otherwise deviates minimally from the planned trajectory to the extent required to maintain safety.
This work presents a methodology for modeling and predicting human behavior in settings with N humans interacting in highly multimodal scenarios (i.e. where there are many possible highly-distinct futures). A motivating example includes robots interacting with humans in crowded environments, such as self-driving cars operating alongside human-driven vehicles or human-robot collaborative bin packing in a warehouse. Our approach to model human behavior in such uncertain environments is to model humans in the scene as nodes in a graphical model, with edges encoding relationships between them. For each human, we learn a multimodal probability distribution over future actions from a dataset of multi-human interactions. Learning such distributions is made possible by recent advances in the theory of conditional variational autoencoders and deep learning approximations of probabilistic graphical models. Specifically, we learn action distributions conditioned on interaction history, neighboring human behavior, and candidate future agent behavior in order to take into account response dynamics. We demonstrate the performance of such a modeling approach in modeling basketball player trajectories, a highly multimodal, multi-human scenario which serves as a proxy for many robotic applications.
In this paper we describe a framework towards computing well-localized, robust motion plans through the perception-aware motion planning problem, whereby we seek a low-cost motion plan subject to a separate constraint on perception localization quality. To solve this problem we introduce the Multiobjective Perception-Aware Planning (MPAP) algorithm which explores the state space via a multiobjective search, considering both cost and a perception heuristic. This framework can accommodate a large range of heuristics, allowing those that capture the history dependence of localization drift and represent complex modern perception methods. We present two such heuristics, one derived from a simplified model of robot perception and a second learned from ground-truth sensor error, which we show to be capable of predicting the performance of a state-of-the-art perception system. The solution trajectory from this heuristic-based search is then certified via Monte Carlo methods to be well-localized and robust. The additional computational burden of perception-aware planning is offset by GPU massive parallelization. Through numerical experiments the algorithm is shown to find well-localized, robust solutions in about a second. Finally, we demonstrate MPAP on a quadrotor flying perception-aware and perception-agnostic plans using Google Tango for localization, finding the quadrotor safely executes the perception-aware plan every time, while crashing in over 20% of the perception-agnostic runs due to loss of localization.
This paper presents a method for constructing human-robot interaction policies in settings where multimodality, i.e., the possibility of multiple highly distinct futures, plays a critical role in decision making. We are motivated in this work by the example of traffic weaving, e.g., at highway on-ramps/off-ramps, where entering and exiting cars must swap lanes in a short distance---a challenging negotiation even for experienced drivers due to the inherent multimodal uncertainty of who will pass whom. Our approach is to learn multimodal probability distributions over future human actions from a dataset of human-human exemplars and perform real-time robot policy construction in the resulting environment model through massively parallel sampling of human responses to candidate robot action sequences. Direct learning of these distributions is made possible by recent advances in the theory of conditional variational autoencoders (CVAEs), whereby we learn action distributions simultaneously conditioned on the present interaction history, as well as candidate future robot actions in order to take into account response dynamics. We demonstrate the efficacy of this approach with a human-in-the-loop simulation of a traffic weaving scenario.
This paper presents a tool for addressing a key component in many algorithms for planning robot trajectories under uncertainty: evaluation of the safety of a robot whose actions are governed by a closed-loop feedback policy near a nominal planned trajectory. We describe an adaptive importance sampling Monte Carlo framework that enables the evaluation of a given control policy for satisfaction of a probabilistic collision avoidance constraint which also provides an associated certificate of accuracy (in the form of a confidence interval). In particular this adaptive technique is well-suited to addressing the complexities of rigid-body collision checking applied to non-linear robot dynamics. As a Monte Carlo method it is amenable to parallelization for computational tractability, and is generally applicable to a wide gamut of simulatable systems, including alternative noise models. Numerical experiments demonstrating the effectiveness of the adaptive importance sampling procedure are presented and discussed.
This paper presents a novel approach, named the Group Marching Tree (GMT*) algorithm, to planning on GPUs at rates amenable to application within control loops, allowing planning in real-world settings via repeated computation of near-optimal plans. GMT*, like the Fast Marching Tree (FMT) algorithm, explores the state space with a "lazy" dynamic programming recursion on a set of samples to grow a tree of near-optimal paths. GMT*, however, alters the approach of FMT with approximate dynamic programming by expanding, in parallel, the group of all active samples with cost below an increasing threshold, rather than only the minimum cost sample. This group approximation enables low-level parallelism over the sample set and removes the need for sequential data structures, while the "lazy" collision checking limits thread divergence---all contributing to a very efficient GPU implementation. While this approach incurs some suboptimality, we prove that GMT* remains asymptotically optimal up to a constant multiplicative factor. We show solutions for complex planning problems under differential constraints can be found in ~10 ms on a desktop GPU and ~30 ms on an embedded GPU, representing a significant speed up over the state of the art, with only small losses in performance. Finally, we present a scenario demonstrating the efficacy of planning within the control loop (~100 Hz) towards operating in dynamic, uncertain settings.
In this paper we present the PUMP (Parallel Uncertainty-aware Multiobjective Planning) algorithm for addressing the stochastic kinodynamic motion planning problem, whereby one seeks a low-cost, dynamically-feasible motion plan subject to a constraint on collision probability (CP). To ensure exhaustive evaluation of candidate motion plans (as needed to tradeoff the competing objectives of performance and safety), PUMP incrementally builds the Pareto front of the problem, accounting for the optimization objective and an approximation of CP. This is performed by a massively parallel multiobjective search, here implemented with a focus on GPUs. Upon termination of the exploration phase, PUMP searches the Pareto set of motion plans to identify the lowest cost solution that is certified to satisfy the CP constraint (according to an asymptotically exact estimator). We introduce a novel particle-based CP approximation scheme, designed for efficient GPU implementation, which accounts for dependencies over the history of a trajectory execution. We present numerical experiments for quadrotor planning wherein PUMP identifies solutions in ~100 ms, evaluating over one hundred thousand partial plans through the course of its exploration phase. The results show that this multiobjective search achieves a lower motion plan cost, for the same CP constraint, compared to a safety buffer-based search heuristic and repeated RRT trials.