Abstract:The probability hypothesis density (PHD) and Poisson multi-Bernoulli (PMB) filters are two popular set-type multi-object filters. Motivated by the fact that the multi-object filtering density after each update step in the PHD filter is a PMB without approximation, in this paper we present a multi-object smoother involving PHD forward filtering and PMB backward smoothing. This is achieved by first running the PHD filtering recursion in the forward pass and extracting the PMB filtering densities after each update step, before the Poisson point process approximation that is inherent in the PHD filter update. Then, in the backward pass, we apply backward simulation for sets of trajectories to the extracted PMB filtering densities. We call the resulting multi-object smoother the hybrid PHD-PMB trajectory smoother. Notably, the hybrid PHD-PMB trajectory smoother can provide smoothed trajectory estimates for the PHD filter without labeling or tagging, which is not possible for existing PHD smoothers. Also, compared to the trajectory PHD filter, which can only estimate alive trajectories, the hybrid PHD-PMB trajectory smoother enables the estimation of the set of all trajectories. Simulation results demonstrate that the hybrid PHD-PMB trajectory smoother outperforms the PHD filter in terms of both state and cardinality estimates, and the trajectory PHD filter in terms of false detections.
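For context, the PMB structure that the abstract above refers to can be written out with the standard PHD update. This is a sketch in notation assumed here, not taken from the paper: $D_{k|k-1}$ is the predicted intensity, $p_D$ the detection probability, $\ell(z \mid x)$ the single-measurement likelihood, $\lambda_c$ the clutter intensity, and $Z_k$ the measurement set at time $k$.

```latex
\begin{align}
e_z &= \int p_D(x)\,\ell(z \mid x)\,D_{k|k-1}(x)\,\mathrm{d}x, \\
r_z &= \frac{e_z}{\lambda_c(z) + e_z}, \qquad
p_z(x) = \frac{p_D(x)\,\ell(z \mid x)\,D_{k|k-1}(x)}{e_z}, \\
D_{k|k}(x) &= \bigl(1 - p_D(x)\bigr)\,D_{k|k-1}(x) + \sum_{z \in Z_k} r_z\,p_z(x).
\end{align}
```

Before the sum is collapsed into the single intensity $D_{k|k}$, the updated density consists of a Poisson part with intensity $(1 - p_D)\,D_{k|k-1}$ and one Bernoulli component per measurement, with existence probability $r_z$ and state density $p_z$. Those are the components a smoother of this kind would extract.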
Abstract:Simultaneous localization and mapping (SLAM) methods need to solve both the data association (DA) problem and the joint estimation of the sensor trajectory and the map, conditioned on a DA. In this paper, we propose a novel integrated approach to solve both the DA problem and the batch SLAM problem simultaneously, combining random finite set (RFS) theory and the graph-based SLAM approach. A sampling method based on the Poisson multi-Bernoulli mixture (PMBM) density is designed for dealing with the DA uncertainty, and a graph-based SLAM solver is applied for the conditional SLAM problem. Finally, a post-processing approach is applied to merge SLAM results from different iterations. Using synthetic data, it is demonstrated that the proposed SLAM approach achieves performance close to the posterior Cramér-Rao bound, and outperforms state-of-the-art RFS-based SLAM filters in high clutter and high process noise scenarios.
Abstract:Accurate and timely determination of a vehicle's current lane within a map is a critical task in autonomous driving systems. This paper utilizes an Early Time Series Classification (ETSC) method to achieve precise and rapid ego-lane identification in real-world driving data. The method begins by assessing the similarities between map and lane markings perceived by the vehicle's camera using measurement model quality metrics. These metrics are then fed into a selected ETSC method, comprising a probabilistic classifier and a tailored trigger function, optimized via multi-objective optimization to strike a balance between early prediction and accuracy. Our solution has been evaluated on a comprehensive dataset consisting of 114 hours of real-world traffic data, collected across 5 different countries by our test vehicles. Results show that by leveraging road lane-marking geometry and lane-marking type derived solely from a camera, our solution achieves an impressive accuracy of 99.6%, with an average prediction time of only 0.84 seconds.
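The interplay between a probabilistic classifier and a trigger function in early time series classification can be illustrated with a minimal sketch. The fixed confidence threshold used here is an assumption chosen for simplicity; it is not the tailored trigger function of the paper, and the names `early_classify`, `prob_stream`, and `threshold` are hypothetical.

```python
def early_classify(prob_stream, threshold=0.9):
    """Return (predicted_class, steps_used) for a stream of class probabilities.

    prob_stream: sequence of per-time-step class probability vectors, as a
                 probabilistic classifier might output on a growing prefix.
    threshold:   assumed trigger rule; predict as soon as the top class
                 probability reaches this value.
    Falls back to the prediction at the final step if the trigger never fires.
    """
    if not prob_stream:
        raise ValueError("prob_stream must contain at least one time step")
    best, step = 0, 0
    for step, probs in enumerate(prob_stream, start=1):
        best = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best] >= threshold:
            break  # trigger fires: stop observing and commit to a class
    return best, step


# Toy stream: confidence in class 0 grows as more data arrives.
stream = [
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.93, 0.05, 0.02],
    [0.99, 0.005, 0.005],
]
print(early_classify(stream, threshold=0.9))  # (0, 3)
```

Lowering the threshold makes the trigger fire earlier at the risk of more errors, which is the trade-off between earliness and accuracy that the abstract describes balancing through multi-objective optimization.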
Abstract:High-definition maps with accurate lane-level information are crucial for autonomous driving, but the creation of these maps is a resource-intensive process. To this end, we present a cost-effective solution to create lane-level roadmaps using only the global navigation satellite system (GNSS) and a camera on customer vehicles. Our proposed solution utilizes a prior standard-definition (SD) map, GNSS measurements, visual odometry, and lane marking edge detection points to simultaneously estimate the vehicle's 6D pose, its position within an SD map, and the 3D geometry of traffic lines. This is achieved using a Bayesian simultaneous localization and multi-object tracking filter, where the estimation of traffic lines is formulated as a multiple extended object tracking problem, solved using a trajectory Poisson multi-Bernoulli mixture (TPMBM) filter. In TPMBM filtering, traffic lines are modeled using B-spline trajectories, and each trajectory is parameterized by a sequence of control points. The proposed solution has been evaluated using experimental data collected by a test vehicle driving on a highway. Preliminary results show that the traffic line estimates, overlaid on the satellite image, generally align with the lane markings up to some lateral offsets.
Abstract:Roadside perception is a key component in intelligent transportation systems. In this paper, we present a novel three-dimensional (3D) extended object tracking (EOT) method, which simultaneously estimates the object kinematics and extent state, in roadside perception using both radar and camera data. Because of the influence of sensor viewing angle and limited angle resolution, radar measurements from objects are sparse and non-uniformly distributed, leading to inaccuracies in object extent and position estimation. To address this problem, we present a novel spherical Gaussian function weighted Gaussian mixture model. This model assumes that radar measurements originate from a series of probabilistic weighted radar reflectors on the vehicle's extent. Additionally, we utilize visual detection of vehicle keypoints to provide additional information on the positions of radar reflectors. Since keypoints may not always correspond to radar reflectors, we propose an elastic skeleton fusion mechanism, which constructs a virtual force to establish the relationship between the radar reflectors on the vehicle and its extent. Furthermore, to better describe the kinematic state of the vehicle and constrain its extent state, we develop a new 3D constant turn rate and velocity motion model, considering the complex 3D motion of the vehicle relative to the roadside sensor. Finally, we apply variational Bayesian approximation to the intractable measurement update step to enable recursive Bayesian estimation of the object's state. Simulation results using the CARLA simulator and experimental results on the nuScenes dataset demonstrate the effectiveness and superiority of the proposed method in comparison to several state-of-the-art 3D EOT methods.
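As background for the constant turn rate and velocity (CTRV) motion model mentioned above, the sketch below shows the textbook planar, noise-free CTRV state transition. It is not the new 3D model developed in the paper; the state layout `(x, y, v, yaw, yaw_rate)` and the function name `ctrv_step` are assumptions made for illustration.

```python
import math


def ctrv_step(state, dt, eps=1e-9):
    """Propagate a planar CTRV state (x, y, v, yaw, yaw_rate) by dt seconds.

    Speed and yaw rate are held constant over the step. When the yaw rate is
    (numerically) zero the motion degenerates to a straight line.
    """
    x, y, v, yaw, w = state
    if abs(w) > eps:
        # Motion along a circular arc of radius v / w.
        x += v / w * (math.sin(yaw + w * dt) - math.sin(yaw))
        y += v / w * (math.cos(yaw) - math.cos(yaw + w * dt))
    else:
        # Straight-line limit of the arc equations.
        x += v * math.cos(yaw) * dt
        y += v * math.sin(yaw) * dt
    return (x, y, v, yaw + w * dt, w)


# Quarter turn to the left at unit speed: ends at (2/pi, 2/pi) heading pi/2.
print(ctrv_step((0.0, 0.0, 1.0, 0.0, math.pi / 2), dt=1.0))
```

In a tracking filter this deterministic transition would be combined with process noise on speed and yaw rate, and a 3D variant would need additional states beyond the five used here.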
Abstract:Multiple extended target tracking (ETT) has gained increasing attention due to the development of high-precision LiDAR and radar sensors in automotive applications. For LiDAR point cloud-based vehicle tracking, this paper presents a probabilistic measurement-region association (PMRA) ETT model, which can describe the complex measurement distribution by partitioning the target extent into different regions. The PMRA model overcomes the drawbacks of previous data-region association (DRA) models by eliminating the approximation error of constrained estimation and using continuous integrals to more reliably calculate the association probabilities. Furthermore, the PMRA model is integrated with the Poisson multi-Bernoulli mixture (PMBM) filter for tracking multiple vehicles. Simulation results illustrate the superior estimation accuracy of the proposed PMRA-PMBM filter in terms of both positions and extents of the vehicles compared with PMBM filters using the gamma Gaussian inverse Wishart and DRA implementations.
Abstract:Few-shot segmentation aims to train a segmentation model that can quickly adapt to a novel task for which only a few annotated images are provided. Most recent models have adopted a prototype-based paradigm for few-shot inference. These approaches may have limited generalization capacity beyond the standard 1- or 5-shot settings. In this paper, we closely examine and reevaluate the fine-tuning based learning scheme that fine-tunes the classification layer of a deep segmentation network pre-trained on diverse base classes. To improve the generalizability of the classification layer optimized with sparsely annotated samples, we introduce an instance-aware data augmentation (IDA) strategy that augments the support images based on the relative sizes of the target objects. The proposed IDA effectively increases the support set's diversity and promotes the distribution consistency between support and query images. On the other hand, the large visual difference between query and support images may hinder knowledge transfer and cripple the segmentation performance. To cope with this challenge, we introduce the local consensus guided cross attention (LCCA) to align the query feature with support features based on their dense correlation, further improving the model's generalizability to the query image. The significant performance improvements on the standard few-shot segmentation benchmarks PASCAL-$5^i$ and COCO-$20^i$ verify the efficacy of our proposed method.
Abstract:Multi-object tracking (MOT) is the task of estimating the state trajectories of an unknown and time-varying number of objects over a certain time window. Several algorithms have been proposed to tackle the multi-object smoothing task, where object detections can be conditioned on all the measurements in the time window. However, the best-performing methods suffer from intractable computational complexity and require approximations, performing suboptimally in complex settings. Deep learning (DL) based algorithms are a possible avenue for tackling this issue but have not been applied extensively in settings where accurate multi-object models are available and measurements are low-dimensional. We propose a novel DL architecture specifically tailored for this setting that decouples the data association task from the smoothing task. We compare the performance of the proposed smoother to the state-of-the-art in different tasks of varying difficulty and provide, to the best of our knowledge, the first comparison between traditional Bayesian trackers and DL trackers in the smoothing problem setting.
Abstract:This paper considers a batch solution to the multi-object tracking problem based on sets of trajectories. Specifically, we present two offline implementations of the trajectory Poisson multi-Bernoulli mixture (TPMBM) filter for batch data based on Markov chain Monte Carlo (MCMC) sampling of the data association hypotheses. In contrast to online TPMBM implementations, the proposed offline implementations solve a large-scale, multi-scan data association problem across the entire time interval of interest, and therefore they can fully exploit all the measurement information available. Furthermore, by leveraging the efficient hypothesis structure of TPMBM filters, the proposed implementations compare favorably with other MCMC-based multi-object tracking algorithms. Simulation results show that the TPMBM implementation using the Metropolis-Hastings algorithm presents state-of-the-art multiple trajectory estimation performance.
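The idea of sampling data association hypotheses with the Metropolis-Hastings algorithm can be illustrated on a deliberately small toy problem. The sketch below is not the TPMBM implementation from the paper: it assumes a single scan, one measurement per target, no clutter, missed detections, or births, and a Gaussian likelihood with a hypothetical `sigma`. The function names are likewise hypothetical.

```python
import math
import random
from collections import Counter


def log_weight(assoc, targets, measurements, sigma=1.0):
    """Unnormalized log-probability of an association under Gaussian noise."""
    return -0.5 * sum(
        ((measurements[m] - x) / sigma) ** 2 for x, m in zip(targets, assoc)
    )


def mh_associations(targets, measurements, n_steps=2000, seed=0):
    """Sample target-to-measurement assignments with Metropolis-Hastings.

    assoc[i] is the index of the measurement assigned to target i. The
    proposal swaps two entries, which is symmetric, so the acceptance ratio
    reduces to the ratio of the unnormalized weights.
    """
    rng = random.Random(seed)
    assoc = list(range(len(targets)))
    rng.shuffle(assoc)  # arbitrary initial hypothesis
    current = log_weight(assoc, targets, measurements)
    visits = Counter()
    for _ in range(n_steps):
        i, j = rng.sample(range(len(assoc)), 2)
        proposal = assoc[:]
        proposal[i], proposal[j] = proposal[j], proposal[i]
        cand = log_weight(proposal, targets, measurements)
        # Accept uphill moves always, downhill moves with prob. exp(cand - current).
        if cand >= current or rng.random() < math.exp(cand - current):
            assoc, current = proposal, cand
        visits[tuple(assoc)] += 1
    return visits


targets = [0.0, 5.0, 10.0]
measurements = [0.2, 5.1, 9.7]
visits = mh_associations(targets, measurements)
print(visits.most_common(1)[0][0])  # (0, 1, 2)
```

The visit counts approximate the posterior over association hypotheses. In a multi-scan setting of the kind the abstract describes, the state of the chain would instead cover associations across the whole time interval, with proposals that also account for clutter, missed detections, and trajectory births and deaths.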
Abstract:Accurate perception is essential for advancing autonomous driving and addressing safety challenges in modern transportation systems. Despite significant advancements in computer vision for object recognition, current perception methods still face difficulties in complex real-world traffic environments. Challenges such as physical occlusion and limited sensor field of view persist for individual vehicle systems. Cooperative Perception (CP) with Vehicle-to-Everything (V2X) technologies has emerged as a solution to overcome these obstacles and enhance driving automation systems. While some research has explored CP's fundamental architecture and critical components, there remains a lack of comprehensive summaries of the latest innovations, particularly in the context of V2X communication technologies. To address this gap, this paper provides a comprehensive overview of the evolution of CP technologies, spanning from early explorations to recent developments, including advancements in V2X communication technologies. Additionally, a contemporary generic framework is proposed to illustrate the V2X-based CP workflow, aiding in the structured understanding of CP system components. Furthermore, this paper categorizes prevailing V2X-based CP methodologies based on the critical issues they address. An extensive literature review is conducted within this taxonomy, evaluating existing datasets and simulators. Finally, open challenges and future directions in CP for autonomous driving are discussed by considering both perception and V2X communication advancements.