Abstract:Nowadays, one of the main challenges in presentation attack detection (PAD) on ID cards is obtaining generalisation capabilities for a diversity of countries that are issuing ID cards. Most PAD systems are trained on one, two, or three ID documents because of privacy protection concerns. As a result, they do not obtain competitive results for commercial purposes when tested in an unknown new ID card country. In this scenario, Foundation Models (FM) trained on huge datasets can help to improve generalisation capabilities. This work intends to improve and benchmark the capabilities of FM and how to use them to adapt the generalisation on PAD of ID Documents. Different test protocols were used, considering zero-shot and fine-tuning and two different ID card datasets. One private dataset based on Chilean IDs and one open-set based on three ID countries: Finland, Spain, and Slovakia. Our findings indicate that bona fide images are the key to generalisation.
Abstract:The demand for Presentation Attack Detection (PAD) to identify fraudulent ID documents in remote verification systems has significantly risen in recent years. This increase is driven by several factors, including the rise of remote work, online purchasing, migration, and advancements in synthetic images. Additionally, we have noticed a surge in the number of attacks aimed at the enrolment process. Training a PAD to detect fake ID documents is very challenging because of the limited number of ID documents available due to privacy concerns. This work proposes a new passport dataset generated from a hybrid method that combines synthetic data and open-access information using the ICAO requirement to obtain realistic training and testing images.
Abstract:Face Recognition Systems (FRS) are increasingly vulnerable to face-morphing attacks, prompting the development of Morphing Attack Detection (MAD) algorithms. However, a key challenge in MAD lies in its limited generalizability to unseen data and its lack of explainability-critical for practical application environments such as enrolment stations and automated border control systems. Recognizing that most existing MAD algorithms rely on supervised learning paradigms, this work explores a novel approach to MAD using zero-shot learning leveraged on Large Language Models (LLMs). We propose two types of zero-shot MAD algorithms: one leveraging general vision models and the other utilizing multimodal LLMs. For general vision models, we address the MAD task by computing the mean support embedding of an independent support set without using morphed images. For the LLM-based approach, we employ the state-of-the-art GPT-4 Turbo API with carefully crafted prompts. To evaluate the feasibility of zero-shot MAD and the effectiveness of the proposed methods, we constructed a print-scan morph dataset featuring various unseen morphing algorithms, simulating challenging real-world application scenarios. Experimental results demonstrated notable detection accuracy, validating the applicability of zero-shot learning for MAD tasks. Additionally, our investigation into LLM-based MAD revealed that multimodal LLMs, such as ChatGPT, exhibit remarkable generalizability to untrained MAD tasks. Furthermore, they possess a unique ability to provide explanations and guidance, which can enhance transparency and usability for end-users in practical applications.
Abstract:Face image quality assessment (FIQA) algorithms are being integrated into online identity management applications. These applications allow users to upload a face image as part of their document issuance process, where the image is then run through a quality assessment process to make sure it meets the quality and compliance requirements. Concerns about demographic bias have been raised about biometric systems, given the societal implications this may cause. It is therefore important that demographic variability in FIQA algorithms is assessed such that mitigation measures can be created. In this work, we study the demographic variability of all face image quality measures included in the ISO/IEC 29794-5 international standard across three demographic variables: age, gender, and skin tone. The results are rather promising and show no clear bias toward any specific demographic group for most measures. Only two quality measures are found to have considerable variations in their outcomes for different groups on the skin tone variable.
Abstract:Fair operational systems are crucial in gaining and maintaining society's trust in face recognition systems (FRS). FRS start with capturing an image and assessing its quality before using it further for enrollment or verification. Fair Face Image Quality Assessment (FIQA) schemes therefore become equally important in the context of fair FRS. This work examines the sclera as a quality assessment region for obtaining a fair FIQA. The sclera region is agnostic to demographic variations and skin colour for assessing the quality of a face image. We analyze three skin tone related ISO/IEC face image quality assessment measures and assess the sclera region as an alternative area for assessing FIQ. Our analysis of the face dataset of individuals from different demographic groups representing different skin tones indicates sclera as an alternative to measure dynamic range, over- and under-exposure of face using sclera region alone. The sclera region being agnostic to skin tone, i.e., demographic factors, provides equal utility as a fair FIQA as shown by our Error-vs-Discard Characteristic (EDC) curve analysis.
Abstract:Acquiring face images of sufficiently high quality is important for online ID and travel document issuance applications using face recognition systems (FRS). Low-quality, manipulated (intentionally or unintentionally), or distorted images degrade the FRS performance and facilitate documents' misuse. Securing quality for enrolment images, especially in the unsupervised self-enrolment scenario via a smartphone, becomes important to assure FRS performance. In this work, we focus on the less studied area of radial distortion (a.k.a., the fish-eye effect) in face images and its impact on FRS performance. We introduce an effective radial distortion detection model that can detect and flag radial distortion in the enrolment scenario. We formalize the detection model as a face image quality assessment (FIQA) algorithm and provide a careful inspection of the effect of radial distortion on FRS performance. Evaluation results show excellent detection results for the proposed models, and the study on the impact on FRS uncovers valuable insights into how to best use these models in operational systems.
Abstract:A face image is a mandatory part of ID and travel documents. Obtaining high-quality face images when issuing such documents is crucial for both human examiners and automated face recognition systems. In several international standards, face image quality requirements are intricate and defined in detail. Identifying and understanding non-compliance or defects in the submitted face images is crucial for both issuing authorities and applicants. In this work, we introduce FaceOracle, an LLM-powered AI assistant that helps its users analyze a face image in a natural conversational manner using standard compliant algorithms. Leveraging the power of LLMs, users can get explanations of various face image quality concepts as well as interpret the outcome of face image quality assessment (FIQA) algorithms. We implement a proof-of-concept that demonstrates how experts at an issuing authority could integrate FaceOracle into their workflow to analyze, understand, and communicate their decisions more efficiently, resulting in enhanced productivity.
Abstract:Foundation models are becoming increasingly popular due to their strong generalization capabilities resulting from being trained on huge datasets. These generalization capabilities are attractive in areas such as NIR Iris Presentation Attack Detection (PAD), in which databases are limited in the number of subjects and diversity of attack instruments, and there is no correspondence between the bona fide and attack images because, most of the time, they do not belong to the same subjects. This work explores an iris PAD approach based on two foundation models, DinoV2 and VisualOpenClip. The results show that fine-tuning prediction with a small neural network as head overpasses the state-of-the-art performance based on deep learning approaches. However, systems trained from scratch have still reached better results if bona fide and attack images are available.
Abstract:Face morphing attacks pose a severe security threat to face recognition systems, enabling the morphed face image to be verified against multiple identities. To detect such manipulated images, the development of new face morphing methods becomes essential to increase the diversity of training datasets used for face morph detection. In this study, we present a representation-level face morphing approach, namely LADIMO, that performs morphing on two face recognition embeddings. Specifically, we train a Latent Diffusion Model to invert a biometric template - thus reconstructing the face image from an FRS latent representation. Our subsequent vulnerability analysis demonstrates the high morph attack potential in comparison to MIPGAN-II, an established GAN-based face morphing approach. Finally, we exploit the stochastic LADMIO model design in combination with our identity conditioning mechanism to create unlimited morphing attacks from a single face morph image pair. We show that each face morph variant has an individual attack success rate, enabling us to maximize the morph attack potential by applying a simple re-sampling strategy. Code and pre-trained models available here: https://github.com/dasec/LADIMO
Abstract:Face morphing attack detection (MAD) algorithms have become essential to overcome the vulnerability of face recognition systems. To solve the lack of large-scale and public-available datasets due to privacy concerns and restrictions, in this work we propose a new method to generate a synthetic face morphing dataset with 2450 identities and more than 100k morphs. The proposed synthetic face morphing dataset is unique for its high-quality samples, different types of morphing algorithms, and the generalization for both single and differential morphing attack detection algorithms. For experiments, we apply face image quality assessment and vulnerability analysis to evaluate the proposed synthetic face morphing dataset from the perspective of biometric sample quality and morphing attack potential on face recognition systems. The results are benchmarked with an existing SOTA synthetic dataset and a representative non-synthetic and indicate improvement compared with the SOTA. Additionally, we design different protocols and study the applicability of using the proposed synthetic dataset on training morphing attack detection algorithms.