Abstract: Localization is a critical technology in autonomous driving, encompassing both topological localization, which identifies the map keyframe most similar to the current observation, and metric localization, which provides precise spatial coordinates. Conventional methods typically address these tasks independently, rely on single-camera setups, and often require additional 3D semantic or pose priors. They also lack mechanisms to quantify the confidence of localization results, which makes them less feasible for real industrial applications. In this paper, we propose VVLoc, a unified pipeline that employs a single neural network to concurrently achieve topological and metric vehicle localization using a multi-camera system. VVLoc first evaluates the geo-proximity between visual observations, then estimates their relative metric poses using a matching strategy, while also providing a confidence measure. Additionally, the training process for VVLoc is highly efficient, requiring only pairs of visual data and corresponding ground-truth poses, which eliminates the need for complex supplementary data. We evaluate VVLoc not only on publicly available datasets but also on a more challenging self-collected dataset, demonstrating its ability to deliver state-of-the-art localization accuracy across a wide range of localization tasks.




Abstract: Autonomous driving vehicles (ADVs) hold great promise for relieving traffic congestion and reducing the number of traffic accidents. Accurate trajectory prediction for the other traffic agents around an ADV is key to safe and efficient driving. Pedestrians are particularly challenging to forecast because of their complex social interactions and seemingly random movement patterns. We propose a Residual Graph Convolutional Neural Network (Res-GCNN), which models the interactive behaviors of pedestrians using the adjacency matrix of a graph constructed for the current scene. The proposed Res-GCNN is lightweight, with only about 6.4 thousand parameters, fewer than all compared methods, yet our experimental results show a 13.3% improvement over the state of the art on the Final Displacement Error (FDE), which reaches 0.65 meters. For the Average Displacement Error (ADE), we achieve 0.37 meters, which is not the best result but is still very competitive. Res-GCNN is evaluated on a platform with an NVIDIA GeForce RTX1080Ti GPU, and its mean inference time over the whole dataset is only about 2.2 microseconds. Compared with other methods, the proposed method shows strong potential for onboard application in terms of both forecasting accuracy and time efficiency. The code will be made publicly available on GitHub.
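
For readers unfamiliar with the two metrics named above, ADE and FDE are commonly computed as the mean and the final-step Euclidean distance between predicted and ground-truth positions. The sketch below assumes that standard definition for a single trajectory; the function name `displacement_errors` and the toy trajectories are illustrative assumptions and do not come from the paper.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) in meters.

    Hypothetical helper for illustration only; not the authors' evaluation code.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each time step
    dists = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    ade = sum(dists) / len(dists)  # average over all predicted steps
    fde = dists[-1]                # error at the final predicted step
    return ade, fde


# Toy example: the prediction drifts 1 m further from the truth at each step.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

In practice these values are usually averaged over all pedestrians and scenes in a dataset; how the paper aggregates them is not stated in the abstract.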