Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kourosh Khoshelham

PC2Model: ISPRS benchmark on 3D point cloud to model registration

Apr 21, 2026

Mehdi Maboudi, Said Harb, Jackson Ferrao, Kourosh Khoshelham, Yelda Turkan, Karam Mawas

Abstract:Point cloud registration involves aligning one point cloud with another or with a three-dimensional (3D) model, enabling the integration of multimodal data into a unified representation. This is essential in applications such as construction monitoring, autonomous driving, robotics, and virtual or augmented reality (VR/AR).With the increasing accessibility of point cloud acquisition technologies, such as Light Detection and Ranging (LiDAR) and structured light scanning, along with recent advances in deep learning, the research focus has increasingly shifted towards downstream tasks, particularly point cloud-to-model (PC2Model) registration. While data-driven methods aim to automate this process, they struggle with sparsity, noise, clutter, and occlusions in real-world scans, which limit their performance. To address these challenges, this paper introduces the PC2Model benchmark, a publicly available dataset designed to support the training and evaluation of both classical and data-driven methods. Developed under the leadership of ICWG II/Ib, the PC2Model benchmark adopts a hybrid design that combines simulated point clouds with, in some cases, real-world scans and their corresponding 3D models. Simulated data provide precise ground truth and controlled conditions, while real-world data introduce sensor and environmental artefacts. This design supports robust training and evaluation across domains and enables the systematic analysis of model transferability from simulated to real-world scenarios. The dataset is publicly accessible at: https://zenodo.org/uploads/17581812.

* ISPRS Congress 2026, Toronto

Via

Access Paper or Ask Questions

Neural Distribution Prior for LiDAR Out-of-Distribution Detection

Apr 10, 2026

Zizhao Li, Zhengkang Xiang, Jiayang Ao, Feng Liu, Joseph West, Kourosh Khoshelham

Abstract:LiDAR-based perception is critical for autonomous driving due to its robustness to poor lighting and visibility conditions. Yet, current models operate under the closed-set assumption and often fail to recognize unexpected out-of-distribution (OOD) objects in the open world. Existing OOD scoring functions exhibit limited performance because they ignore the pronounced class imbalance inherent in LiDAR OOD detection and assume a uniform class distribution. To address this limitation, we propose the Neural Distribution Prior (NDP), a framework that models the distributional structure of network predictions and adaptively reweights OOD scores based on alignment with a learned distribution prior. NDP dynamically captures the logit distribution patterns of training data and corrects class-dependent confidence bias through an attention-based module. We further introduce a Perlin noise-based OOD synthesis strategy that generates diverse auxiliary OOD samples from input scans, enabling robust OOD training without external datasets. Extensive experiments on the SemanticKITTI and STU benchmarks demonstrate that NDP substantially improves OOD detection performance, achieving a point-level AP of 61.31\% on the STU test set, which is more than 10$\times$ higher than the previous best result. Our framework is compatible with various existing OOD scoring formulations, providing an effective solution for open-world LiDAR perception.

* CVPR 2026

Via

Access Paper or Ask Questions

Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection

Apr 05, 2026

Xueyang Kang, Zizhao Li, Tian Lan, Dong Gong, Kourosh Khoshelham, Liangliang Nan

Abstract:3D shape anomaly detection is a crucial task for industrial inspection and geometric analysis. Existing deep learning approaches typically learn representations of normal shapes and identify anomalies via out-of-distribution feature detection or decoder-based reconstruction. They often fail to generalize across diverse anomaly types and scales, such as global geometric errors (e.g., planar shifts, angle misalignments), and are sensitive to noisy or incomplete local points during training. To address these limitations, we propose a hierarchical point-patch anomaly scoring network that jointly models regional part features and local point features for robust anomaly reasoning. An adaptive patchification module integrates self-supervised decomposition to capture complex structural deviations. Beyond evaluations on public benchmarks (Anomaly-ShapeNet and Real3D-AD), we release an industrial test set with real CAD models exhibiting planar, angular, and structural defects. Experiments on public and industrial datasets show superior AUC-ROC and AUC-PR performance, including over 40% point-level improvement on the new industrial anomaly type and average object-level gains of 7% on Real3D-AD and 4% on Anomaly-ShapeNet, demonstrating strong robustness and generalization.

* CVPR 2026
* 10 pages, 5 figures, 6 tables

Via

Access Paper or Ask Questions

Direct Estimation of Tree Volume and Aboveground Biomass Using Deep Regression with Synthetic Lidar Data

Mar 04, 2026

Habib Pourdelan, Zhengkang Xiang, Hugh Stewart, Cam Nicholson, Martin Tomko, Kourosh Khoshelham

Abstract:Accurate estimation of forest biomass is crucial for monitoring carbon sequestration and informing climate change mitigation strategies. Existing methods often rely on allometric models, which estimate individual tree biomass by relating it to measurable biophysical parameters, e.g., trunk diameter and height. This indirect approach is limited in accuracy due to measurement uncertainties and the inherently approximate nature of allometric equations, which may not fully account for the variability in tree characteristics and forest conditions. This study proposes a direct approach that leverages synthetic point cloud data to train a deep regression network, which is then applied to real point clouds for plot-level wood volume and aboveground biomass (AGB) estimation. We created synthetic 3D forest plots with ground truth volume, which were then converted into point cloud data using a lidar simulator. These point clouds were subsequently used to train deep regression networks based on PointNet, PointNet++, DGCNN, and PointConv. When applied to synthetic data, the deep regression networks achieved mean absolute percentage error (MAPE) values ranging from 1.69% to 8.11%. The trained networks were then applied to real lidar data to estimate volume and AGB. When compared against field measurements, our direct approach showed discrepancies of 2% to 20%. In contrast, indirect approaches based on individual tree segmentation followed by allometric conversion, as well as FullCAM, exhibited substantially large underestimation, with discrepancies ranging from 27% to 85%. Our results highlight the potential of integrating synthetic data with deep learning for efficient and scalable forest carbon estimation at plot level.

Via

Access Paper or Ask Questions

Relative Energy Learning for LiDAR Out-of-Distribution Detection

Nov 11, 2025

Zizhao Li, Zhengkang Xiang, Jiayang Ao, Joseph West, Kourosh Khoshelham

Figure 1 for Relative Energy Learning for LiDAR Out-of-Distribution Detection

Figure 2 for Relative Energy Learning for LiDAR Out-of-Distribution Detection

Figure 3 for Relative Energy Learning for LiDAR Out-of-Distribution Detection

Figure 4 for Relative Energy Learning for LiDAR Out-of-Distribution Detection

Abstract:Out-of-distribution (OOD) detection is a critical requirement for reliable autonomous driving, where safety depends on recognizing road obstacles and unexpected objects beyond the training distribution. Despite extensive research on OOD detection in 2D images, direct transfer to 3D LiDAR point clouds has been proven ineffective. Current LiDAR OOD methods struggle to distinguish rare anomalies from common classes, leading to high false-positive rates and overconfident errors in safety-critical settings. We propose Relative Energy Learning (REL), a simple yet effective framework for OOD detection in LiDAR point clouds. REL leverages the energy gap between positive (in-distribution) and negative logits as a relative scoring function, mitigating calibration issues in raw energy values and improving robustness across various scenes. To address the absence of OOD samples during training, we propose a lightweight data synthesis strategy called Point Raise, which perturbs existing point clouds to generate auxiliary anomalies without altering the inlier semantics. Evaluated on SemanticKITTI and the Spotting the Unexpected (STU) benchmark, REL consistently outperforms existing methods by a large margin. Our results highlight that modeling relative energy, combined with simple synthetic outliers, provides a principled and scalable solution for reliable OOD detection in open-world autonomous driving.

* The code and checkpoints will be released after paper acceptance

Via

Access Paper or Ask Questions

Surfel-based 3D Registration with Equivariant SE(3) Features

Aug 28, 2025

Xueyang Kang, Hang Zhao, Kourosh Khoshelham, Patrick Vandewalle

Abstract:Point cloud registration is crucial for ensuring 3D alignment consistency of multiple local point clouds in 3D reconstruction for remote sensing or digital heritage. While various point cloud-based registration methods exist, both non-learning and learning-based, they ignore point orientations and point uncertainties, making the model susceptible to noisy input and aggressive rotations of the input point cloud like orthogonal transformation; thus, it necessitates extensive training point clouds with transformation augmentations. To address these issues, we propose a novel surfel-based pose learning regression approach. Our method can initialize surfels from Lidar point cloud using virtual perspective camera parameters, and learns explicit $\mathbf{SE(3)}$ equivariant features, including both position and rotation through $\mathbf{SE(3)}$ equivariant convolutional kernels to predict relative transformation between source and target scans. The model comprises an equivariant convolutional encoder, a cross-attention mechanism for similarity computation, a fully-connected decoder, and a non-linear Huber loss. Experimental results on indoor and outdoor datasets demonstrate our model superiority and robust performance on real point-cloud scans compared to state-of-the-art methods.

* Published on 2025 IEEE International Geoscience and Remote Sensing Symposium
* 5 pages, 4 figures

Via

Access Paper or Ask Questions

From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Dec 01, 2024

Zizhao Li, Zhengkang Xiang, Joseph West, Kourosh Khoshelham

Figure 1 for From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Figure 2 for From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Figure 3 for From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Figure 4 for From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Abstract:Traditional object detection methods operate under the closed-set assumption, where models can only detect a fixed number of objects predefined in the training set. Recent works on open vocabulary object detection (OVD) enable the detection of objects defined by an unbounded vocabulary, which reduces the cost of training models for specific tasks. However, OVD heavily relies on accurate prompts provided by an ''oracle'', which limits their use in critical applications such as driving scene perception. OVD models tend to misclassify near-out-of-distribution (NOOD) objects that have similar semantics to known classes, and ignore far-out-of-distribution (FOOD) objects. To address theses limitations, we propose a framework that enables OVD models to operate in open world settings, by identifying and incrementally learning novel objects. To detect FOOD objects, we propose Open World Embedding Learning (OWEL) and introduce the concept of Pseudo Unknown Embedding which infers the location of unknown classes in a continuous semantic space based on the information of known classes. We also propose Multi-Scale Contrastive Anchor Learning (MSCAL), which enables the identification of misclassified unknown objects by promoting the intra-class consistency of object embeddings at different scales. The proposed method achieves state-of-the-art performance in common open world object detection and autonomous driving benchmarks.

Via

Access Paper or Ask Questions

Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration

Oct 08, 2024

Xueyang Kang, Zhaoliang Luan, Kourosh Khoshelham, Bing Wang

Abstract:Point cloud registration is a foundational task for 3D alignment and reconstruction applications. While both traditional and learning-based registration approaches have succeeded, leveraging the intrinsic symmetry of point cloud data, including rotation equivariance, has received insufficient attention. This prohibits the model from learning effectively, resulting in a requirement for more training data and increased model complexity. To address these challenges, we propose a graph neural network model embedded with a local Spherical Euclidean 3D equivariance property through SE(3) message passing based propagation. Our model is composed mainly of a descriptor module, equivariant graph layers, match similarity, and the final regression layers. Such modular design enables us to utilize sparsely sampled input points and initialize the descriptor by self-trained or pre-trained geometric feature descriptors easily. Experiments conducted on the 3DMatch and KITTI datasets exhibit the compelling and robust performance of our model compared to state-of-the-art approaches, while the model complexity remains relatively low at the same time.

* 18 main body pages, and 9 pages for supplementary part

Via

Access Paper or Ask Questions

LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes

Jul 05, 2024

Zexian Huang, Kourosh Khoshelham, Gunditj Mirring Traditional Owners Corporation, Martin Tomko

Figure 1 for LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes

Figure 2 for LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes

Figure 3 for LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes

Figure 4 for LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes

Abstract:Semantic segmentation of large-scale 3D landscape meshes is pivotal for various geospatial applications, including spatial analysis, automatic mapping and localization of target objects, and urban planning and development. This requires an efficient and accurate 3D perception system to understand and analyze real-world environments. However, traditional mesh segmentation methods face challenges in accurately segmenting small objects and maintaining computational efficiency due to the complexity and large size of 3D landscape mesh datasets. This paper presents an end-to-end deep graph message-passing network, LMSeg, designed to efficiently and accurately perform semantic segmentation on large-scale 3D landscape meshes. The proposed approach takes the barycentric dual graph of meshes as inputs and applies deep message-passing neural networks to hierarchically capture the geometric and spatial features from the barycentric graph structures and learn intricate semantic information from textured meshes. The hierarchical and local pooling of the barycentric graph, along with the effective geometry aggregation modules of LMSeg, enable fast inference and accurate segmentation of small-sized and irregular mesh objects in various complex landscapes. Extensive experiments on two benchmark datasets (natural and urban landscapes) demonstrate that LMSeg significantly outperforms existing learning-based segmentation methods in terms of object segmentation accuracy and computational efficiency. Furthermore, our method exhibits strong generalization capabilities across diverse landscapes and demonstrates robust resilience against varying mesh densities and landscape topologies.

Via

Access Paper or Ask Questions

Learning Geometric Invariant Features for Classification of Vector Polygons with Graph Message-passing Neural Network

Jul 05, 2024

Zexian Huang, Kourosh Khoshelham, Martin Tomko

Figure 1 for Learning Geometric Invariant Features for Classification of Vector Polygons with Graph Message-passing Neural Network

Figure 2 for Learning Geometric Invariant Features for Classification of Vector Polygons with Graph Message-passing Neural Network

Figure 3 for Learning Geometric Invariant Features for Classification of Vector Polygons with Graph Message-passing Neural Network

Figure 4 for Learning Geometric Invariant Features for Classification of Vector Polygons with Graph Message-passing Neural Network

Abstract:Geometric shape classification of vector polygons remains a non-trivial learning task in spatial analysis. Previous studies mainly focus on devising deep learning approaches for representation learning of rasterized vector polygons, whereas the study of discrete representations of polygons and subsequent deep learning approaches have not been fully investigated. In this study, we investigate a graph representation of vector polygons and propose a novel graph message-passing neural network (PolyMP) to learn the geometric-invariant features for shape classification of polygons. Through extensive experiments, we show that the graph representation of polygons combined with a permutation-invariant graph message-passing neural network achieves highly robust performances on benchmark datasets (i.e., synthetic glyph and real-world building footprint datasets) as compared to baseline methods. We demonstrate that the proposed graph-based PolyMP network enables the learning of expressive geometric features invariant to geometric transformations of polygons (i.e., translation, rotation, scaling and shearing) and is robust to trivial vertex removals of polygons. We further show the strong generalizability of PolyMP, which enables generalizing the learned geometric features from the synthetic glyph polygons to the real-world building footprints.

Via

Access Paper or Ask Questions