The vast progress in image synthesis enables the generation of facial images with high resolution and photorealism. In biometric applications, the main motivation for using synthetic data is to mitigate the shortage of publicly available biometric data while reducing the privacy risks of processing such sensitive information. This work exploits these advantages by simulating human face ageing with recent face age modification algorithms to generate mated samples, thereby studying the impact of ageing on the performance of an open-source biometric recognition system. Further, a real dataset is used to evaluate the effects of short-term ageing, comparing the biometric performance to the synthetic domain. The main findings indicate that short-term ageing in the range of 1-5 years has only minor effects on general recognition performance. However, correctly verifying mated faces with long-term age differences beyond 20 years still poses a significant challenge and requires further investigation.
This paper presents a summary of the Competition on Face Morphing Attack Detection Based on Privacy-aware Synthetic Training Data (SYN-MAD) held at the 2022 International Joint Conference on Biometrics (IJCB 2022). The competition attracted a total of 12 participating teams from both academia and industry, based in 11 different countries. In the end, seven valid submissions were received from the participating teams and evaluated by the organizers. The competition was held to present and attract solutions that detect face morphing attacks while protecting people's privacy for ethical and legal reasons. To ensure this, the training data was limited to synthetic data provided by the organizers. The submitted solutions presented innovations that outperformed the considered baseline in many experimental settings. The evaluation benchmark is now available at: https://github.com/marcohuber/SYN-MAD-2022.
In this paper, we present a novel self-supervised method to anticipate the depth estimate for a future, unobserved real-world urban scene. This work is the first to explore self-supervised learning for monocular depth estimation of future, unobserved video frames. Existing works rely on a large number of annotated samples to generate probabilistic depth predictions for unseen frames, which is unrealistic given the requirement for large amounts of annotated depth data over video. In addition, the probabilistic nature of the task, where one past can have multiple future outcomes, often leads to incorrect depth estimates. Unlike previous methods, we model depth estimation of the unobserved frame as a view-synthesis problem, treating the depth estimate of the unseen video frame as an auxiliary task while synthesizing back the views using a learned pose. This approach is not only cost-effective, since we do not use any ground-truth depth for training (hence practical), but also deterministic (a sequence of past frames maps to an immediate future). To address this task, we first develop a novel depth forecasting network, DeFNet, which estimates the depth of the unobserved future by forecasting latent features. Second, we develop a channel-attention-based pose estimation network that estimates the pose of the unobserved frame. Using this learned pose, the estimated depth map is reconstructed back into the image domain, forming a self-supervised solution. Our proposed approach shows significant improvements in the Abs Rel metric compared to state-of-the-art alternatives in both short- and mid-term forecasting settings, benchmarked on KITTI and Cityscapes. Code is available at https://github.com/sauradip/depthForecasting
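The self-supervised signal in such view-synthesis formulations is typically a photometric reconstruction loss between the target frame and the view synthesized from the estimated depth and pose. The following is a minimal sketch of that idea, not the paper's implementation: the structural term here is a simplified stand-in for SSIM, and the weight `alpha` is a hypothetical value in the spirit of common self-supervised depth pipelines.

```python
import numpy as np

def photometric_loss(target, synthesized, alpha=0.85):
    """Weighted mix of a simplified structural term and per-pixel L1.
    (Illustrative stand-in for the SSIM+L1 losses used in self-supervised
    depth estimation; 'alpha' is a hypothetical weight.)"""
    l1 = np.abs(target - synthesized).mean()
    # Simplified "structural" term: compare global means (SSIM stand-in).
    struct = np.abs(target.mean() - synthesized.mean())
    return alpha * struct + (1 - alpha) * l1

# A perfectly reconstructed view incurs zero loss; a mismatched one does not.
frame = np.random.rand(4, 4, 3)
assert photometric_loss(frame, frame) == 0.0
assert photometric_loss(frame, 1.0 - frame) > 0.0
```

In a full pipeline the synthesized view would come from warping past frames with the forecast depth and the learned pose, so minimizing this loss trains depth without ground-truth labels.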
While several works have studied the vulnerability of automated FRS and proposed morphing attack detection (MAD) methods, very few have focused on the human ability to detect morphing attacks. An examiner's or observer's face morph detection ability depends on their observation, domain knowledge, experience, and familiarity with the problem, and no works report detailed findings from observers who check identity documents as part of their everyday professional life. This work creates a new benchmark database of realistic morphing attacks from 48 unique subjects, leading to 400 morphed images presented to observers in a Differential-MAD (D-MAD) setting. Unlike existing databases, the newly created morphed image database was created with careful consideration of age, gender, and ethnicity to produce realistic morphing attacks. Further, unlike previous works, we also capture ten images from Automated Border Control (ABC) gates to mimic the realistic D-MAD setting, leading to 400 probe images in border-crossing scenarios. The newly created dataset is further used to study the ability of human observers to detect morphed images. In addition, a new dataset of 180 morphed images is created using the FRGCv2 dataset under the Single Image-MAD (S-MAD) setting. Further, to benchmark human ability in detecting morphs, a new evaluation platform is created to conduct S-MAD and D-MAD analysis. The benchmark study employs 469 observers for D-MAD and 410 observers for S-MAD, primarily governmental employees from more than 40 countries. The analysis provides interesting insights, pointing to gaps in expert observers' competence and their failure to detect a considerable number of morphing attacks. Human observers tend to detect morphed images with lower accuracy than the automated MAD algorithms evaluated in this work.
Face Recognition Systems (FRS) have been found vulnerable to morphing attacks, in which a morphed face image is generated by blending face images from contributing data subjects. This work presents a novel direction for generating face morphing attacks in 3D. To this end, we introduce a novel approach based on blending the 3D face point clouds corresponding to the contributing data subjects. The proposed method generates a 3D face morph by projecting the input 3D face point clouds to depth maps and 2D color images, followed by image blending and warping operations performed independently on the color images and depth maps. We then back-project the 2D morphed color map and depth map to a point cloud using a canonical (fixed) view. Since the generated 3D face morphing models contain holes due to the single canonical view, we propose a new hole-filling algorithm that yields high-quality 3D face morphing models. Extensive experiments are carried out on a newly generated 3D face dataset comprising 675 3D scans of 41 unique data subjects. Experiments are performed to benchmark the vulnerability of automated 2D and 3D FRS, together with a human observer analysis. We also present a quantitative assessment of the quality of the generated 3D face morphing models using eight different quality metrics. Finally, we propose three different 3D face Morphing Attack Detection (3D-MAD) algorithms to benchmark the performance of 3D MAD.
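The two core steps described above, blending aligned 2D maps and back-projecting the blended depth map to a point cloud, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the morph weight and the canonical pinhole-camera intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical values.

```python
import numpy as np

def blend_maps(map_a, map_b, alpha=0.5):
    """Pixel-wise blending of two aligned maps (color or depth).
    alpha=0.5 is a common morph weight (illustrative choice)."""
    return alpha * map_a + (1.0 - alpha) * map_b

def back_project(depth, fx=500.0, fy=500.0, cx=32.0, cy=32.0):
    """Back-project a depth map to a 3D point cloud under a hypothetical
    canonical pinhole camera; intrinsics here are illustrative."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

depth_a = np.full((64, 64), 1.0)          # subject A, constant depth
depth_b = np.full((64, 64), 3.0)          # subject B, constant depth
morphed = blend_maps(depth_a, depth_b)    # all pixels at depth 2.0
cloud = back_project(morphed)             # 64*64 = 4096 3D points
```

Because only a single canonical view is used, surfaces occluded in that view produce holes in the resulting cloud, which motivates the hole-filling step described in the abstract.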
Enabling highly secure applications (such as border crossing) with face recognition requires extensive biometric performance tests on large-scale data. However, using real face images raises privacy concerns, as laws do not allow the images to be used for purposes other than those originally intended. Using only subsets of the available face data can also lead to unwanted demographic biases and imbalanced datasets. One possible solution to overcome these issues is to replace real face images with synthetically generated samples. While generating synthetic images has benefited from recent advancements in computer vision, generating multiple samples of the same synthetic identity that resemble real-world variations, i.e., mated samples, remains unaddressed. This work proposes a non-deterministic method for generating mated face images by exploiting the well-structured latent space of StyleGAN. Mated samples are generated by manipulating latent vectors; more precisely, we exploit Principal Component Analysis (PCA) to define semantically meaningful directions in the latent space and control the similarity between the original and the mated samples using a pre-trained face recognition system. We create a new dataset of synthetic face images (SymFace) consisting of 77,034 samples from 25,919 synthetic IDs. Through analysis using well-established face image quality metrics, we demonstrate the differences in biometric quality of synthetic samples mimicking the characteristics of real biometric data. The analysis and results indicate that synthetic samples created using the proposed approach are a viable alternative to real biometric data.
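The latent-space manipulation described above can be sketched in a few lines. This is a hedged illustration of the general idea (PCA directions over latent vectors, then a controlled perturbation), not the paper's method: the latent dimensionality, number of components, and `scale` knob are assumptions, and the face-recognition similarity check that gates the perturbation strength is omitted.

```python
import numpy as np

def pca_directions(latents, k=5):
    """Estimate k semantically meaningful directions in a latent space via
    PCA over a sample of latent vectors (illustrative of the approach)."""
    centered = latents - latents.mean(axis=0)
    # Principal components = right singular vectors of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def make_mated(latent, directions, rng, scale=0.5):
    """Create a mated sample by a random perturbation along the PCA
    directions; 'scale' is a hypothetical similarity-control knob."""
    coeffs = rng.normal(0.0, scale, size=len(directions))
    return latent + coeffs @ directions

rng = np.random.default_rng(0)
latents = rng.normal(size=(100, 512))   # stand-in for StyleGAN latent vectors
dirs = pca_directions(latents)
mated = make_mated(latents[0], dirs, rng)
```

In the actual pipeline, a pre-trained face recognition system would compare the images decoded from `latents[0]` and `mated`, and the perturbation would be accepted only if the pair still matches as the same identity.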
Face morphing attacks can compromise Face Recognition Systems (FRS) by exploiting their vulnerability. Face Morphing Attack Detection (MAD) techniques have been developed in the recent past to deter such attacks and mitigate the associated risks. MAD algorithms, like any other algorithms, should treat images of subjects from different ethnic origins equally and provide non-discriminatory results. While promising MAD algorithms are tested for robustness, no study comprehensively benchmarks their behaviour across ethnicities. In this paper, we present a comprehensive analysis of the algorithmic fairness of existing Single image-based Morph Attack Detection (S-MAD) algorithms. We attempt to better understand the influence of ethnic bias on MAD algorithms and, to this end, study the performance of MAD algorithms on a newly created dataset consisting of four different ethnic groups. Through extensive experiments with six different S-MAD techniques, we first present a benchmark of detection performance and then quantify the algorithmic fairness of each technique using the Fairness Discrepancy Rate (FDR). The results indicate a lack of fairness in all six S-MAD methods when trained and tested on different ethnic groups, suggesting the need for reliable MAD approaches to mitigate algorithmic bias.
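The Fairness Discrepancy Rate summarizes, at a fixed operating threshold, the largest pairwise gap in error rates across demographic groups; a perfectly fair detector scores 1.0. The sketch below follows the general FDR formulation of de Freitas Pereira and Marcel applied to MAD-style error rates (APCER/BPCER per group); the group names, rates, and the weight `alpha` are illustrative, and the paper's exact parameterization may differ.

```python
from itertools import combinations

def fdr(apcer_by_group, bpcer_by_group, alpha=0.5):
    """Fairness Discrepancy Rate: 1 minus a weighted max pairwise gap of the
    two error rates across demographic groups at a fixed threshold.
    (Illustrative; weights and grouping are assumptions.)"""
    groups = list(apcer_by_group)
    a = max(abs(apcer_by_group[g] - apcer_by_group[h])
            for g, h in combinations(groups, 2))
    b = max(abs(bpcer_by_group[g] - bpcer_by_group[h])
            for g, h in combinations(groups, 2))
    return 1.0 - (alpha * a + (1.0 - alpha) * b)

# Identical error rates in every group -> perfectly fair, FDR = 1.0
apcer = {"g1": 0.10, "g2": 0.10, "g3": 0.10}
bpcer = {"g1": 0.05, "g2": 0.05, "g3": 0.05}
assert fdr(apcer, bpcer) == 1.0
# Unequal error rates across groups lower the score.
assert fdr({"g1": 0.10, "g2": 0.30}, {"g1": 0.05, "g2": 0.05}) < 1.0
```

Reporting FDR alongside raw detection rates makes it explicit when a method's overall accuracy hides large per-group disparities.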
The robustness and generalization ability of Presentation Attack Detection (PAD) methods are critical to ensuring the security of Face Recognition Systems (FRS). However, in real-world scenarios, Presentation Attacks (PAs) are diverse and hard to collect. Existing PAD methods depend heavily on limited training sets and cannot generalize well to unknown PAs. Unlike the PAD task, other face-related tasks trained on huge amounts of real faces (e.g., face recognition and attribute editing) can be effectively adapted to different application scenarios. Inspired by this, we propose to apply taskonomy (task taxonomy) from other face-related tasks to face PAD, so as to improve the generalization ability in detecting PAs. The proposed method first introduces task-specific features from other face-related tasks; then, a Cross-Modal Adapter based on a Graph Attention Network (GAT) re-maps these features to adapt them to the PAD task. Finally, face PAD is performed using the hierarchical features from a CNN-based PA detector together with the re-mapped features. The experimental results show that the proposed method achieves significant improvements on complicated and hybrid datasets compared with state-of-the-art methods. In particular, when trained on OULU-NPU, CASIA-FASD, and Idiap Replay-Attack, we obtain an HTER (Half Total Error Rate) of 5.48% on MSU-MFSD, outperforming the baseline by 7.39%. Code will be made publicly available.
Iris presentation attack detection (IPAD) is essential for securing personal identity in widely used iris recognition systems. However, existing IPAD algorithms do not generalize well to unseen and cross-domain scenarios because of capture in unconstrained environments and the high visual correlation between bona fide and attack samples. The similarities in the intricate textural and morphological patterns of iris ocular images contribute further to performance degradation. To alleviate these shortcomings, this paper proposes DFCANet: a Dense Feature Calibration and Attention-Guided Network that calibrates locally spread iris patterns with globally located ones. Benefiting from feature-calibration convolution and residual learning, DFCANet generates domain-specific iris feature representations. Since some channels in the calibrated feature maps contain more prominent information, we capitalize on discriminative feature learning across channels through a channel attention mechanism. To intensify the challenge for our proposed model, DFCANet operates on non-segmented and non-normalized ocular iris images. Extensive experimentation on challenging cross-domain and intra-domain scenarios shows consistently superior results. Compared to state-of-the-art methods, DFCANet achieves significant performance gains on the benchmark IIITD CLI, IIIT CSD, and NDCLD13 databases. Further, a novel incremental learning-based methodology is introduced to overcome disentangled iris-data characteristics and data scarcity. The paper also pursues the challenging scenario of placing soft lenses under the attack category, with evaluation performed under various cross-domain protocols. The code will be made publicly available.
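Channel attention of the kind referenced above is commonly implemented in the squeeze-and-excitation style: pool each channel to a scalar, pass the result through a small bottleneck, and use sigmoid gates to re-weight the channels. The sketch below shows that generic mechanism with hypothetical weight shapes; DFCANet's exact attention design may differ.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """SE-style channel attention: squeeze spatial dims, bottleneck, and
    re-weight channels with sigmoid gates. (Generic illustration; the
    paper's exact formulation is not reproduced here.)"""
    c = features.shape[0]
    squeezed = features.reshape(c, -1).mean(axis=1)        # (C,) global pool
    hidden = np.maximum(0.0, w1 @ squeezed)                # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))         # sigmoid in (0, 1)
    return features * weights[:, None, None]               # per-channel scale

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 16))   # (channels, H, W) feature map
w1 = rng.normal(size=(2, 8))           # reduction to 2 hidden units (assumed)
w2 = rng.normal(size=(8, 2))           # expansion back to 8 channels
out = channel_attention(feats, w1, w2)
```

Because each gate lies in (0, 1), the mechanism can only attenuate channels, emphasizing the more discriminative ones relative to the rest.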