The commonly used metrics for motion prediction do not correlate well with a self-driving vehicle's system-level performance. The most common metrics are average displacement error (ADE) and final displacement error (FDE), which omit many features, making them poor self-driving performance indicators. Since high-fidelity simulations and track testing can be resource-intensive, the use of prediction metrics better correlated with full-system behavior allows for swifter iteration cycles. In this paper, we offer a conceptual framework for prediction evaluation highly specific to self-driving. We propose two complementary metrics that quantify the effects of motion prediction on safety (related to recall) and comfort (related to precision). Using a simulator, we demonstrate that our safety metric has a significantly better signal-to-noise ratio than displacement error in identifying unsafe events.
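For reference, the two displacement metrics named above can be written down in a few lines. This is a minimal sketch using the standard definitions for a single predicted trajectory; the function name `displacement_errors` and the 2D `(x, y)` waypoint representation are illustrative assumptions, not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # assumed (x, y) position in meters


def displacement_errors(
    predicted: Sequence[Point], ground_truth: Sequence[Point]
) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted trajectory.

    ADE is the mean Euclidean distance over all timesteps;
    FDE is the Euclidean distance at the final timestep only.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")

    # Per-timestep Euclidean distance between prediction and ground truth.
    distances: List[float] = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


# Hypothetical example: the prediction drifts laterally by 0, 1, and 2 meters.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)  # ADE = 1.0, FDE = 2.0
```

Both values depend only on the magnitude of the error, so a drift toward the self-driving vehicle's path scores the same as an equal drift away from it. This is one example of the information such metrics leave out.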
Behavior prediction of traffic actors is an essential component of any real-world self-driving system. Actors' long-term behaviors tend to be governed by their interactions with other actors or traffic elements (traffic lights, stop signs) in the scene. To capture this highly complex structure of interactions, we propose to use a hybrid graph whose nodes represent both the traffic actors and the static and dynamic traffic elements present in the scene. The different modes of temporal interaction (e.g., stopping and going) among actors and traffic elements are explicitly modeled by graph edges. This explicit reasoning about discrete interaction types not only helps in predicting future motion, but also enhances the interpretability of the model, which is important for safety-critical applications such as autonomous driving. We predict actors' trajectories and interaction types using a graph neural network, which is trained in a semi-supervised manner. We show that our proposed model, TrafficGraphNet, achieves state-of-the-art trajectory prediction accuracy while maintaining a high level of interpretability.
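The hybrid graph described above can be pictured as a small data structure. The sketch below is illustrative only: the class names (`NodeKind`, `Interaction`, `HybridGraph`), the two interaction labels, and the edge direction (from the influencing node to the influenced actor) are assumptions made for this example, and the graph neural network itself is not shown.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class NodeKind(Enum):
    ACTOR = "actor"                      # vehicles, pedestrians, etc.
    TRAFFIC_ELEMENT = "traffic_element"  # traffic lights, stop signs


class Interaction(Enum):
    # Hypothetical discrete interaction types, after "stopping and going".
    STOP = "stop"
    GO = "go"


@dataclass
class HybridGraph:
    """Graph whose nodes are actors and traffic elements (assumed layout)."""

    nodes: Dict[str, NodeKind] = field(default_factory=dict)
    edges: List[Tuple[str, str, Interaction]] = field(default_factory=list)

    def add_node(self, node_id: str, kind: NodeKind) -> None:
        self.nodes[node_id] = kind

    def add_edge(self, src: str, dst: str, interaction: Interaction) -> None:
        # Edge points from the influencing node to the influenced node.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((src, dst, interaction))

    def interactions_for(self, node_id: str) -> List[Tuple[str, Interaction]]:
        """List (source, interaction type) pairs acting on a given node."""
        return [(src, kind) for src, dst, kind in self.edges if dst == node_id]


# Hypothetical scene: one car stopping for a red light, another car going.
graph = HybridGraph()
graph.add_node("car_1", NodeKind.ACTOR)
graph.add_node("car_2", NodeKind.ACTOR)
graph.add_node("light_A", NodeKind.TRAFFIC_ELEMENT)
graph.add_edge("light_A", "car_1", Interaction.STOP)
graph.add_edge("car_1", "car_2", Interaction.GO)
print(graph.interactions_for("car_1"))
```

Keeping the interaction type on the edge is what makes the structure readable by a human: each predicted motion can be traced back to the discrete relationship that explains it.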
We present a new method for multi-modal, long-term vehicle trajectory prediction. Our approach relies on using lane centerlines captured in rich maps of the environment to generate a set of proposed goal paths for each vehicle. These paths are generated at run time and therefore adapt dynamically to the scene. Using them as spatial anchors, we predict a set of goal-based trajectories along with a categorical distribution over the goals. This approach allows us to directly model the goal-directed behavior of traffic actors, which unlocks the potential for more accurate long-term prediction. Our experimental results on both a large-scale internal driving dataset and on the public nuScenes dataset show that our model outperforms state-of-the-art approaches for vehicle trajectory prediction over a 6-second horizon. We also empirically demonstrate that our model generalizes better than existing methods to road scenes from a completely new city.
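To make "a categorical distribution over the goals" concrete, the sketch below turns one score per candidate goal path into probabilities with a softmax. The abstract does not state how the distribution is produced, so the softmax, the function name `goal_distribution`, and the example scores are all assumptions for illustration.

```python
import math
from typing import List, Sequence


def goal_distribution(scores: Sequence[float]) -> List[float]:
    """Convert per-goal scores into a categorical distribution (softmax).

    The number of scores may differ per vehicle, since the candidate goal
    paths are generated at run time from the surrounding lanes.
    """
    if not scores:
        raise ValueError("at least one candidate goal is required")

    # Subtract the maximum score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate goal paths:
# keep lane, turn left, turn right.
probs = goal_distribution([2.0, 0.5, 0.5])
most_likely = max(range(len(probs)), key=probs.__getitem__)
print(probs, most_likely)
```

Because the distribution is defined over whatever goal paths exist in the current scene, the number of modes follows the road layout and is not fixed in advance.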
Predicting the possible future behaviors of vehicles that drive on shared roads is a crucial task for safe autonomous driving. Many existing approaches to this problem strive to distill all possible vehicle behaviors into a simplified set of high-level actions. However, these action categories do not suffice to describe the full range of maneuvers possible in the complex road networks we encounter in the real world. To combat this deficiency, we propose a new method that leverages the mapped road topology to reason over possible goals and predict the future spatial occupancy of dynamic road actors. We show that our approach accurately predicts future occupancy that remains consistent with the mapped lane geometry and naturally captures multi-modality based on the local scene context, without suffering from the mode collapse problem observed in prior work.
Motion prediction of surrounding vehicles is one of the most important tasks handled by a self-driving vehicle, and represents a critical step in the autonomous system necessary to ensure safety for all the involved traffic actors. Recently, a number of researchers from both the academic and industrial communities have focused on this important problem, proposing ideas ranging from engineered, rule-based methods to learned approaches, shown to perform well at different prediction horizons. In particular, while the engineered methods outperform the competing approaches for longer-term trajectories, the learned methods have proven to be the best choice at short-term horizons. In this work we describe how to overcome the discrepancy between these two research directions, and propose a method that combines the disparate approaches under a single unifying framework. The resulting algorithm fuses learned, uncertainty-aware trajectories with lane-based paths in a principled manner, resulting in improved prediction accuracy at both shorter- and longer-term horizons. Experiments on real-world, large-scale data strongly suggest the benefits of the proposed unified method, which outperformed the existing state of the art. Moreover, following offline evaluation, the proposed method was successfully tested onboard a self-driving vehicle.