Abstract:Monocular re-localization plays a crucial role in enabling intelligent agents to achieve human-like perception. However, traditional methods rely on dense maps, which face scalability limitations and privacy risks. OpenStreetMap (OSM), as a lightweight map that protects privacy, offers semantic and geometric information with global scalability. Nonetheless, there are still challenges in using OSM for localization: the inherent cross-modal discrepancies between natural images and OSM, as well as the high computational cost of global map-based localization. In this paper, we propose a hierarchical search framework with semantic alignment for localization in OSM. First, the semantic awareness capability of DINO-ViT is utilised to deconstruct visual elements to establish semantic relationships with OSM. Second, a coarse-to-fine search paradigm is designed to replace global dense matching, enabling efficient progressive refinement. Extensive experiments demonstrate that our method significantly improves both localization accuracy and speed. When trained on a single dataset, the 3° orientation recall of our method even outperforms the 5° recall of state-of-the-art methods.




Abstract:Deep learning-based palmprint recognition algorithms have shown great potential. Most of them are mainly focused on identifying samples from the same dataset. However, they may be not suitable for a more convenient case that the images for training and test are from different datasets, such as collected by embedded terminals and smartphones. Therefore, we propose a novel Joint Pixel and Feature Alignment (JPFA) framework for such cross-dataset palmprint recognition scenarios. Two stage-alignment is applied to obtain adaptive features in source and target datasets. 1) Deep style transfer model is adopted to convert source images into fake images to reduce the dataset gaps and perform data augmentation on pixel level. 2) A new deep domain adaptation model is proposed to extract adaptive features by aligning the dataset-specific distributions of target-source and target-fake pairs on feature level. Adequate experiments are conducted on several benchmarks including constrained and unconstrained palmprint databases. The results demonstrate that our JPFA outperforms other models to achieve the state-of-the-arts. Compared with baseline, the accuracy of cross-dataset identification is improved by up to 28.10% and the Equal Error Rate (EER) of cross-dataset verification is reduced by up to 4.69%. To make our results reproducible, the codes are publicly available at http://gr.xjtu.edu.cn/web/bell/resource.




Abstract:Deep palmprint recognition has become an emerging issue with great potential for personal authentication on handheld and wearable consumer devices. Previous studies of palmprint recognition are mainly based on constrained datasets collected by dedicated devices in controlled environments, which has to reduce the flexibility and convenience. In addition, general deep palmprint recognition algorithms are often too heavy to meet the real-time requirements of embedded system. In this paper, a new palmprint benchmark is established, which consists of more than 20,000 images collected by 5 brands of smart phones in an unconstrained manner. Each image has been manually labeled with 14 key points for region of interest (ROI) extraction. Further, the approach called Deep Distillation Hashing (DDH) is proposed as benchmark for efficient deep palmprint recognition. Palmprint images are converted to binary codes to improve the efficiency of feature matching. Derived from knowledge distillation, novel distillation loss functions are constructed to compress deep model to further improve the efficiency of feature extraction on light network. Comprehensive experiments are conducted on both constrained and unconstrained palmprint databases. Using DDH, the accuracy of palmprint identification can be increased by up to 11.37%, and the Equal Error Rate (EER) of palmprint verification can be reduced by up to 3.11%. The results indicate the feasibility of our database, and DDH can outperform other baselines to achieve the state-of-the-art performance. The collected dataset and related source codes are publicly available at http://gr.xjtu.edu.cn/web/bell/resource.