Abstract:Motion forecasts of road users (i.e., agents) vary in complexity as a function of scene constraints and interactive behavior. We address this with a multi-task learning method for motion forecasting that includes a retrocausal flow of information. The corresponding tasks are to forecast (1) marginal trajectory distributions for all modeled agents and (2) joint trajectory distributions for interacting agents. Using a transformer model, we generate the joint distributions by re-encoding marginal distributions followed by pairwise modeling. This incorporates a retrocausal flow of information from later points in marginal trajectories to earlier points in joint trajectories. Per trajectory point, we model positional uncertainty using compressed exponential power distributions. Notably, our method achieves state-of-the-art results in the Waymo Interaction Prediction dataset and generalizes well to the Argoverse 2 dataset. Additionally, our method provides an interface for issuing instructions through trajectory modifications. Our experiments show that regular training of motion forecasting leads to the ability to follow goal-based instructions and to adapt basic directional instructions to the scene context. Code: https://github.com/kit-mrt/future-motion
Abstract:Generative AI (GenAI) is rapidly advancing the field of Autonomous Driving (AD), extending beyond traditional applications in text, image, and video generation. We explore how generative models can enhance automotive tasks, such as static map creation, dynamic scenario generation, trajectory forecasting, and vehicle motion planning. By examining multiple generative approaches ranging from Variational Autoencoder (VAEs) over Generative Adversarial Networks (GANs) and Invertible Neural Networks (INNs) to Generative Transformers (GTs) and Diffusion Models (DMs), we highlight and compare their capabilities and limitations for AD-specific applications. Additionally, we discuss hybrid methods integrating conventional techniques with generative approaches, and emphasize their improved adaptability and robustness. We also identify relevant datasets and outline open research questions to guide future developments in GenAI. Finally, we discuss three core challenges: safety, interpretability, and realtime capabilities, and present recommendations for image generation, dynamic scenario generation, and planning.
Abstract:As cities strive to address urban mobility challenges, combining autonomous transportation technologies with intelligent infrastructure presents an opportunity to transform how people move within urban environments. Autonomous shuttles are particularly suited for adaptive and responsive public transport for the first and last mile, connecting with smart infrastructure to enhance urban transit. This paper presents the concept, implementation, and evaluation of a proof-of-concept deployment of an autonomous shuttle integrated with smart infrastructure at a public fair. The infrastructure includes two perception-equipped bus stops and a connected pedestrian intersection, all linked through a central communication and control hub. Our key contributions include the development of a comprehensive system architecture for "smart" bus stops, the integration of multiple urban locations into a cohesive smart transport ecosystem, and the creation of adaptive shuttle behavior for automated driving. Additionally, we publish an open source dataset and a Vehicle-to-X (V2X) driver to support further research. Finally, we offer an outlook on future research directions and potential expansions of the demonstrated technologies and concepts.
Abstract:Representing diverse and plausible future trajectories of actors is crucial for motion forecasting in autonomous driving. However, efficiently capturing the true trajectory distribution with a compact set is challenging. In this work, we propose a novel approach for generating scene-specific trajectory sets that better represent the diversity and admissibility of future actor behavior. Our method constructs multiple trajectory sets tailored to different scene contexts, such as intersections and non-intersections, by leveraging map information and actor dynamics. We introduce a deterministic goal sampling algorithm that identifies relevant map regions and generates trajectories conditioned on the scene layout. Furthermore, we empirically investigate various sampling strategies and set sizes to optimize the trade-off between coverage and diversity. Experiments on the Argoverse 2 dataset demonstrate that our scene-specific sets achieve higher plausibility while maintaining diversity compared to traditional single-set approaches. The proposed Recursive In-Distribution Subsampling (RIDS) method effectively condenses the representation space and outperforms metric-driven sampling in terms of trajectory admissibility. Our work highlights the benefits of scene-aware trajectory set generation for capturing the complex and heterogeneous nature of actor behavior in real-world driving scenarios.
Abstract:Accurately forecasting the motion of traffic actors is crucial for the deployment of autonomous vehicles at a large scale. Current trajectory forecasting approaches primarily concentrate on optimizing a loss function with a specific metric, which can result in predictions that do not adhere to physical laws or violate external constraints. Our objective is to incorporate explicit knowledge priors that allow a network to forecast future trajectories in compliance with both the kinematic constraints of a vehicle and the geometry of the driving environment. To achieve this, we introduce a non-parametric pruning layer and attention layers to integrate the defined knowledge priors. Our proposed method is designed to ensure reachability guarantees for traffic actors in both complex and dynamic situations. By conditioning the network to follow physical laws, we can obtain accurate and safe predictions, essential for maintaining autonomous vehicles' safety and efficiency in real-world settings.In summary, this paper presents concepts that prevent off-road predictions for safe and reliable motion forecasting by incorporating knowledge priors into the training process.
Abstract:Environmental perception obtained via object detectors have no predictable safety layer encoded into their model schema, which creates the question of trustworthiness about the system's prediction. As can be seen from recent adversarial attacks, most of the current object detection networks are vulnerable to input tampering, which in the real world could compromise the safety of autonomous vehicles. The problem would be amplified even more when uncertainty errors could not propagate into the submodules, if these are not a part of the end-to-end system design. To address these concerns, a parallel module which verifies the predictions of the object proposals coming out of Deep Neural Networks are required. This work aims to verify 3D object proposals from MonoRUn model by proposing a plausibility framework that leverages cross sensor streams to reduce false positives. The verification metric being proposed uses prior knowledge in the form of four different energy functions, each utilizing a certain prior to output an energy value leading to a plausibility justification for the hypothesis under consideration. We also employ a novel two-step schema to improve the optimization of the composite energy function representing the energy model.
Abstract:The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.