Recent years have witnessed a surge of learned representations that build directly upon point clouds. Although increasingly expressive, most existing representations still struggle to generate ordered point sets. Inspired by spherical multi-view scanners, we propose a novel sampling model called Spotlights to represent a 3D shape as a compact 1D array of depth values. It simulates a configuration of cameras evenly distributed on a sphere, where each virtual camera casts light rays from its principal point through sample points on a small concentric spherical cap to probe for possible intersections with the object surrounded by the sphere. The structured point cloud is hence given implicitly as a function of depths. We provide a detailed geometric analysis of this new sampling scheme and demonstrate its effectiveness in the context of the point cloud completion task. Experimental results on both synthetic and real data show that our method achieves competitive accuracy and consistency at a significantly reduced computational cost. Furthermore, we show superior performance on the downstream point cloud registration task over state-of-the-art completion methods.
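The sampling idea can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that the abstract does not state: cameras are placed with a Fibonacci lattice, each camera's rays are spread over a cone around the direction toward the sphere center (standing in for the sample points on the spherical cap), the probed object is an analytic sphere at the origin, and a missed ray is stored as `None`. All function names and parameters are hypothetical.

```python
import math

GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def fibonacci_sphere(n, radius):
    """Return n roughly evenly spaced points on a sphere (assumed camera layout)."""
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = GOLDEN_ANGLE * i
        pts.append((radius * r * math.cos(phi), radius * r * math.sin(phi), radius * z))
    return pts

def cap_directions(axis, half_angle, n_rays):
    """Unit directions inside a cone of half_angle around the unit vector axis."""
    helper = (1.0, 0.0, 0.0) if abs(axis[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _normalize(_cross(helper, axis))  # u and v span the plane orthogonal to axis
    v = _cross(axis, u)
    dirs = []
    for i in range(n_rays):
        # cos(theta) spaced evenly in [cos(half_angle), 1] spreads samples by area
        cos_t = 1.0 - (1.0 - math.cos(half_angle)) * (i + 0.5) / n_rays
        sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
        cp, sp = math.cos(GOLDEN_ANGLE * i), math.sin(GOLDEN_ANGLE * i)
        dirs.append(tuple(cos_t * a + sin_t * (cp * p + sp * q)
                          for a, p, q in zip(axis, u, v)))
    return dirs

def spotlight_rays(n_cams, n_rays, cam_radius, half_angle):
    """Yield (origin, direction) pairs in a fixed, reproducible order."""
    for cam in fibonacci_sphere(n_cams, cam_radius):
        axis = _normalize(tuple(-c for c in cam))  # look toward the sphere center
        for d in cap_directions(axis, half_angle, n_rays):
            yield cam, d

def ray_sphere_depth(origin, direction, radius):
    """Depth of the first hit on a sphere centered at the origin, or None on a miss."""
    b = sum(o * d for o, d in zip(origin, direction))
    c = sum(o * o for o in origin) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else None

def encode(rays, obj_radius):
    """Shape -> 1D array of depths, one entry per ray."""
    return [ray_sphere_depth(o, d, obj_radius) for o, d in rays]

def decode(rays, depths):
    """1D array of depths -> ordered point set (missed rays are skipped)."""
    return [tuple(oc + t * dc for oc, dc in zip(o, d))
            for (o, d), t in zip(rays, depths) if t is not None]

rays = list(spotlight_rays(n_cams=8, n_rays=16, cam_radius=2.0, half_angle=0.3))
depths = encode(rays, obj_radius=1.0)
points = decode(rays, depths)
```

Because the ray layout is fixed, the depth array alone determines the points and their order, which is the sense in which the point cloud is given implicitly as a function of depths.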
Camera relocalization is a key component of simultaneous localization and mapping (SLAM) systems. This paper proposes a learning-based approach, named Sparse Spatial Scene Embedding with Graph Neural Networks (S3E-GNN), as an end-to-end framework for efficient and robust camera relocalization. S3E-GNN consists of two modules. In the encoding module, a trained S3E network encodes RGB images into embedding codes that implicitly represent spatial and semantic information. With the embedding codes and the associated poses obtained from a SLAM system, each image is represented as a graph node in a pose graph. In the GNN query module, the pose graph is transformed to form an embedding-aggregated reference graph for camera relocalization. We collect various scene datasets in challenging environments to perform experiments. Our results demonstrate that the S3E-GNN method outperforms the traditional bag-of-words (BoW) approach for camera relocalization owing to its learning-based embedding and GNN-powered scene matching mechanism.
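The data flow of the two modules can be sketched as follows. This is a rough illustration under strong assumptions, not the S3E-GNN architecture: the embeddings are hand-written vectors rather than outputs of a trained S3E network, a single round of mean neighbor aggregation stands in for the trained GNN, poses are reduced to `(x, y, z)` translations, and matching uses cosine similarity. All names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def aggregate(embeddings, edges):
    """Average each node's embedding with its neighbors (stand-in for a GNN layer)."""
    neighbors = {i: [i] for i in range(len(embeddings))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    aggregated = []
    for i in range(len(embeddings)):
        group = [embeddings[k] for k in neighbors[i]]
        aggregated.append([sum(col) / len(group) for col in zip(*group)])
    return aggregated

def relocalize(query_embedding, ref_embeddings, ref_poses, edges):
    """Return the pose of the reference node that best matches the query."""
    reference_graph = aggregate(ref_embeddings, edges)
    scores = [cosine(query_embedding, e) for e in reference_graph]
    best = max(range(len(scores)), key=scores.__getitem__)
    return ref_poses[best], scores[best]

# Toy pose graph: three keyframes, the first two connected by an edge.
ref_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ref_poses = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
edges = [(0, 1)]

pose, score = relocalize([0.0, 1.0], ref_embeddings, ref_poses, edges)
```

In this toy graph the query matches the third keyframe, so its stored pose is returned as the relocalization estimate.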