Belief propagation (BP) is a probabilistic inference algorithm for efficiently computing approximate marginal probability densities of random variables. However, in its standard form, BP applies only to vector-type random variables, while certain applications rely on set-type random variables with an unknown number of vector elements. In this paper, we first develop BP rules for set-type random variables and demonstrate that vector-type BP is a special case of set-type BP. We further propose factor graphs with set-factor and set-variable nodes, devising set-factor nodes that can handle set-variables whose elements and cardinality are both random, whereas in vector-type BP the number of vector elements is known. To demonstrate the validity of the developed set-type BP, we apply it to the Poisson multi-Bernoulli (PMB) filter for simultaneous localization and mapping (SLAM), which naturally leads to a new set-type BP-SLAM filter. Finally, we reveal connections between the vector-type BP-SLAM filter and the proposed set-type BP-SLAM filter and show a performance gain of the proposed filter over the vector-type BP-SLAM filter.
Positioning with 5G signals generally requires connection to several base stations (BSs), which makes positioning more demanding in terms of infrastructure than communications. To address this issue, there have been several theoretical studies on single-BS positioning, leveraging high-resolution angle and delay estimation and multipath exploitation possibilities at mmWave frequencies. This paper presents the first realistic experimental validation of such studies, involving a commercial 5G mmWave BS and a user equipment (UE) development kit mounted on a test vehicle. We present the relevant signal models and signal processing methods (including channel parameter estimation and position estimation), and validate them on real data collected in an outdoor science park environment. Our results indicate that positioning is possible, but the performance with a single BS is limited by the knowledge of the position and orientation of the infrastructure and by the multipath visibility and diversity.
Open-set semi-supervised learning (OSSL) is a realistic setting of semi-supervised learning where the unlabeled training set contains classes that are not present in the labeled set. Many existing OSSL methods assume that these out-of-distribution data are harmful and put effort into excluding data from unknown classes from the training objective. In contrast, we propose an OSSL framework that facilitates learning from all unlabeled data through self-supervision. Additionally, we utilize an energy-based score to accurately recognize data belonging to the known classes, making our method well-suited for handling uncurated data in deployment. Extensive experimental evaluations on several datasets show that our method achieves overall unmatched robustness and performance in terms of closed-set accuracy and open-set recognition compared with the state of the art for OSSL. Our code will be released upon publication.
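The energy-based score is not specified further in this abstract. As a minimal sketch, assuming the commonly used free-energy form E(x) = -T log Σ_k exp(f_k(x)/T) over classifier logits f_k(x), a sample could be scored as follows. The function name `energy_score`, the temperature value, and the example logits are illustrative assumptions, not the authors' implementation.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free energy -T * log(sum_k exp(logit_k / T)) of a list of logits.

    Assumed form of the score: lower energy suggests a known-class sample.
    """
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return -temperature * log_sum_exp


# Hypothetical logits: one dominant class versus a flat prediction.
confident = energy_score([10.0, 0.0, 0.0])
uncertain = energy_score([0.0, 0.0, 0.0])
```

Under this assumed form, the confident prediction receives a lower energy than the flat one, so thresholding the energy gives a simple known-versus-unknown decision.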
Research connecting text and images has recently seen several breakthroughs, with models like CLIP, DALL-E 2, and Stable Diffusion. However, the connection between text and other visual modalities, such as lidar data, has received less attention, hindered by the lack of text-lidar datasets. In this work, we propose LidarCLIP, a mapping from automotive point clouds to a pre-existing CLIP embedding space. Using image-lidar pairs, we supervise a point cloud encoder with the image CLIP embeddings, effectively relating text and lidar data with the image domain as an intermediary. We show the effectiveness of LidarCLIP by demonstrating that lidar-based retrieval is generally on par with image-based retrieval, but with complementary strengths and weaknesses. By combining image and lidar features, we improve upon both single-modality methods and enable a targeted search for challenging detection scenarios under adverse sensor conditions. We also use LidarCLIP as a tool to investigate fundamental lidar capabilities through natural language. Finally, we leverage our compatibility with CLIP to explore a range of applications, such as point cloud captioning and lidar-to-image generation, without any additional training. We hope LidarCLIP can inspire future work to dive deeper into connections between text and point cloud understanding. Code and trained models are available at https://github.com/atonderski/lidarclip.
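Because the point cloud encoder is supervised with CLIP image embeddings, a text query can in principle be compared with lidar embeddings in the same space. The sketch below illustrates such retrieval with cosine-similarity ranking on toy vectors. The use of cosine similarity, the three-dimensional embeddings, the scene names, and `rank_by_cosine` are hypothetical stand-ins; no real CLIP or LidarCLIP API is used.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_cosine(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    return sorted(gallery, key=lambda key: cosine(query, gallery[key]), reverse=True)


# Toy stand-ins for lidar embeddings and a text embedding in a shared space.
lidar_embeddings = {
    "scene_a": [0.9, 0.1, 0.0],
    "scene_b": [0.0, 1.0, 0.2],
    "scene_c": [0.1, 0.0, 1.0],
}
text_embedding = [1.0, 0.0, 0.1]
ranking = rank_by_cosine(text_embedding, lidar_embeddings)
```

The same ranking function would work unchanged on image embeddings, which is what makes a side-by-side comparison of lidar-based and image-based retrieval straightforward.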
Device localization and radar-like mapping are at the heart of integrated sensing and communication, not only enabling new services and applications but also improving communication quality with reduced overhead. These forms of sensing are, however, susceptible to data association problems, due to the unknown relation between measurements and detected objects or targets. In this chapter, we provide an overview of the fundamental tools used to solve mapping, tracking, and simultaneous localization and mapping (SLAM) problems. We distinguish the different types of sensing problems and then focus on mapping and SLAM as running examples. Starting from the applicable models and definitions, we describe the different algorithmic approaches, with a particular focus on how to deal with data association problems. In particular, methods based on random finite set theory and Bayesian graphical models are introduced in detail. A numerical study with synthetic and experimental data is then used to compare these approaches in a variety of scenarios.
In this paper, we demonstrate that a deep-learning-based method can be used to fuse multi-object densities. Given a scenario with several sensors with possibly different fields of view, tracking is performed locally in each sensor by a tracker, which produces random finite set multi-object densities. To fuse outputs from different trackers, we adapt a recently proposed transformer-based multi-object tracker, where the fusion result is a global multi-object density, describing the set of all alive objects at the current time. We compare the performance of the transformer-based fusion method with a well-performing model-based Bayesian fusion method in several simulated scenarios with different parameter settings using synthetic data. The simulation results show that the transformer-based fusion method outperforms the model-based Bayesian method in our experimental scenarios.
Networks in 5G and beyond utilize millimeter wave (mmWave) radio signals, large bandwidths, and large antenna arrays, which bring opportunities for jointly localizing the user equipment and mapping the propagation environment, termed simultaneous localization and mapping (SLAM). Existing approaches mainly rely on delays and angles, and ignore the Doppler, although it contains geometric information. In this paper, we study the benefits of exploiting Doppler in SLAM by deriving the posterior Cramér-Rao bounds (PCRBs) and formulating the extended Kalman-Poisson multi-Bernoulli sequential filtering solution with Doppler as one of the involved measurements. Both theoretical PCRB analysis and simulation results demonstrate the efficacy of utilizing Doppler.
In this paper, we propose a Poisson multi-Bernoulli (PMB) filter for extended object tracking (EOT), which directly estimates the set of object trajectories, using belief propagation (BP). The proposed filter propagates a PMB density on the posterior of sets of trajectories through the filtering recursions over time, where the PMB mixture (PMBM) posterior after the update step is approximated as a PMB. The efficient PMB approximation relies on several important theoretical contributions. First, we present a PMBM conjugate prior on the posterior of sets of trajectories for a generalized measurement model, in which each object generates an independent set of measurements. The PMBM density is a conjugate prior in the sense that both the prediction and the update steps preserve the PMBM form of the density. Second, we present a factor graph representation of the joint posterior of the PMBM set of trajectories and association variables for the Poisson spatial measurement model. Importantly, leveraging the PMBM conjugacy and the factor graph formulation enables an elegant treatment of undetected objects via a Poisson point process and efficient inference on sets of trajectories using BP, where the approximate marginal densities in the PMB approximation can be obtained without enumeration of different data association hypotheses. To achieve this, we present a particle-based implementation of the proposed filter, where smoothed trajectory estimates, if desired, can be obtained via single-object particle smoothing methods. Its performance for EOT with ellipsoidal shapes is evaluated in a simulation study.
This paper provides a comparative analysis between the adaptive birth model used in the labelled random finite set literature and the track initiation in the Poisson multi-Bernoulli mixture (PMBM) filter, with point-target models. The PMBM track initiation is obtained via Bayes' rule applied to the predicted PMBM density, and creates one Bernoulli component for each received measurement, representing that this measurement may be clutter or a detection from a new target. Adaptive birth mimics this procedure by creating a Bernoulli component for each measurement, using a different rule to determine the probability of existence and a user-defined single-target density. This paper first provides an analysis of the differences that arise in track initiation based on isolated measurements. Then, it shows that adaptive birth underestimates the number of objects present in the surveillance area under common modelling assumptions. Finally, we provide numerical simulations to further illustrate the differences.
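To make the PMBM track-initiation step concrete: in the standard point-target PMBM update, a measurement z that is either clutter or the first detection of a previously undetected target yields a new Bernoulli component with existence probability r = e / (c + e), where c is the clutter intensity at z and e is the expected detection intensity of undetected targets at z. The abstract does not spell out this expression, so it is stated here as an assumption taken from the general PMBM literature; the function and argument names are hypothetical, and the adaptive-birth rule is not reproduced because the abstract does not define it.

```python
def new_bernoulli_existence(clutter_intensity, undetected_detection_intensity):
    """Existence probability r = e / (c + e) of a measurement-initiated Bernoulli.

    clutter_intensity: clutter intensity c evaluated at the measurement.
    undetected_detection_intensity: e, the intensity of undetected targets
        weighted by detection probability and measurement likelihood,
        integrated over the target state (assumed to be precomputed).
    """
    total = clutter_intensity + undetected_detection_intensity
    if total <= 0.0:
        raise ValueError("clutter and detection intensities cannot both be zero")
    return undetected_detection_intensity / total


# Hypothetical values: equal intensities, then clutter ten times stronger.
r_balanced = new_bernoulli_existence(1e-3, 1e-3)
r_cluttered = new_bernoulli_existence(1e-2, 1e-3)
```

With these illustrative numbers, the existence probability drops as clutter becomes more likely than a first detection, which is the behaviour the Bayesian update is meant to capture.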