Topic:facial recognition

What is facial recognition? Facial recognition is an AI-based technique for identifying or confirming an individual's identity using their face. It maps facial features from an image or video and then compares the information with a collection of known faces to find a match.
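As a rough illustration of the "compare with a collection of known faces" step described above, the sketch below matches a probe embedding against a small gallery using cosine similarity. The embeddings, the gallery names, the `identify` helper, and the 0.8 threshold are all hypothetical; in a real system the vectors would come from a trained face encoder and the threshold would be tuned on validation data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe, gallery, threshold=0.8):
    """Return (name, score) of the best gallery match, or (None, score)
    when the best score falls below the (assumed) decision threshold."""
    best_name, best_score = None, -1.0
    for name, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Toy 3-dimensional "embeddings"; real face embeddings are much larger.
gallery = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 0.2, 0.95]}
match, score = identify([0.85, 0.15, 0.05], gallery)
unknown, unknown_score = identify([0.5, 0.5, 0.5], gallery)
```

Returning `None` below the threshold is what separates identification in an open set (the probe may be nobody in the gallery) from simply picking the nearest neighbour.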

SLIP: Spoof-Aware One-Class Face Anti-Spoofing with Language Image Pretraining

Mar 25, 2025

Pei-Kai Huang, Jun-Xiong Chong, Cheng-Hsuan Chiang, Tzu-Hsien Chen, Tyng-Luh Liu, Chiou-Ting Hsu

Abstract:Face anti-spoofing (FAS) plays a pivotal role in ensuring the security and reliability of face recognition systems. With advancements in vision-language pretrained (VLP) models, recent two-class FAS techniques have leveraged the advantages of using VLP guidance, while this potential remains unexplored in one-class FAS methods. The one-class FAS focuses on learning intrinsic liveness features solely from live training images to differentiate between live and spoof faces. However, the lack of spoof training data can lead one-class FAS models to inadvertently incorporate domain information irrelevant to the live/spoof distinction (e.g., facial content), causing performance degradation when tested with a new application domain. To address this issue, we propose a novel framework called Spoof-aware one-class face anti-spoofing with Language Image Pretraining (SLIP). Given that live faces should ideally not be obscured by any spoof-attack-related objects (e.g., paper or masks) and are assumed to yield zero spoof cue maps, we first propose an effective language-guided spoof cue map estimation to enhance one-class FAS models by simulating whether the underlying faces are covered by attack-related objects and generating corresponding nonzero spoof cue maps. Next, we introduce a novel prompt-driven liveness feature disentanglement to alleviate live/spoof-irrelevant domain variations by disentangling live/spoof-relevant and domain-dependent information. Finally, we design an effective augmentation strategy by fusing latent features from live images and spoof prompts to generate spoof-like image features and thus diversify latent spoof features to facilitate the learning of one-class FAS. Our extensive experiments and ablation studies support that SLIP consistently outperforms previous one-class FAS methods.

* Accepted by AAAI 2025

Frequency Matters: Explaining Biases of Face Recognition in the Frequency Domain

Jan 28, 2025

Marco Huber, Fadi Boutros, Naser Damer

Abstract:Face recognition (FR) models are vulnerable to performance variations across demographic groups. The causes for these performance differences are unclear due to the highly complex deep learning-based structure of face recognition models. Several works aimed at exploring possible roots of gender and ethnicity bias, identifying semantic reasons such as hairstyle, make-up, or facial hair as possible sources. Motivated by recent discoveries of the importance of frequency patterns in convolutional neural networks, we explain bias in face recognition using state-of-the-art frequency-based explanations. Our extensive results show that different frequencies are important to FR models depending on the ethnicity of the samples.

* Accepted at xAI4Biometrics at ECCV 2024

Video-DPRP: A Differentially Private Approach for Visual Privacy-Preserving Video Human Activity Recognition

Mar 03, 2025

Allassan Tchangmena A Nken, Susan Mckeever, Peter Corcoran, Ihsan Ullah

Abstract:Considerable effort has been made in privacy-preserving video human activity recognition (HAR). Two primary approaches to ensure privacy preservation in Video HAR are differential privacy (DP) and visual privacy. Techniques enforcing DP during training provide strong theoretical privacy guarantees but offer limited capabilities for visual privacy assessment. Conversely, methods such as low-resolution transformations, data obfuscation and adversarial networks emphasize visual privacy but lack clear theoretical privacy assurances. In this work, we focus on two main objectives: (1) leveraging DP properties to develop a model-free approach for visual privacy in videos and (2) evaluating our proposed technique using both differential privacy and visual privacy assessments on HAR tasks. To achieve goal (1), we introduce Video-DPRP: a Video-sample-wise Differentially Private Random Projection framework for privacy-preserved video reconstruction for HAR. By using random projections, noise matrices and right singular vectors derived from the singular value decomposition of videos, Video-DPRP reconstructs DP videos using privacy parameters ($\epsilon,\delta$) while enabling visual privacy assessment. For goal (2), using UCF101 and HMDB51 datasets, we compare Video-DPRP's performance on activity recognition with traditional DP methods, and state-of-the-art (SOTA) visual privacy-preserving techniques. Additionally, we assess its effectiveness in preserving privacy-related attributes such as facial features, gender, and skin color, using the PA-HMDB and VISPR datasets. Video-DPRP combines privacy-preservation from both a DP and visual privacy perspective, unlike SOTA methods that typically address only one of these aspects.
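To give a feel for how privacy parameters ($\epsilon,\delta$) translate into noise, the sketch below applies the classical Gaussian mechanism to a randomly projected vector. This is a generic textbook illustration, not the Video-DPRP algorithm: the function names, the toy input, and the assumption that the L2 sensitivity of the projected output is known and passed in as `sensitivity` are all hypothetical. The noise calibration used here is the standard one that holds for $0 < \epsilon < 1$.

```python
import math
import random

def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise scale of the classical Gaussian mechanism (valid for 0 < epsilon < 1)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon

def noisy_projection(vector, out_dim, epsilon, delta, sensitivity, seed=0):
    """Project `vector` with a random Gaussian matrix, then add Gaussian noise.

    `sensitivity` is ASSUMED to bound the L2 change of the projected output
    when one input record changes; deriving it is outside this sketch.
    """
    rng = random.Random(seed)
    sigma = gaussian_sigma(epsilon, delta, sensitivity)
    projected = []
    for _ in range(out_dim):
        # One row of the random projection matrix, scaled by 1/sqrt(out_dim).
        row = [rng.gauss(0.0, 1.0 / math.sqrt(out_dim)) for _ in vector]
        value = sum(r * v for r, v in zip(row, vector))
        projected.append(value + rng.gauss(0.0, sigma))
    return projected

sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=1.0)
private = noisy_projection([0.2, 0.4, 0.1, 0.9], out_dim=2,
                           epsilon=0.5, delta=1e-5, sensitivity=1.0)
```

Smaller $\epsilon$ or $\delta$ gives a larger `sigma`, which is the usual trade-off between privacy strength and the utility of the released data.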

CG-MER: A Card Game-based Multimodal dataset for Emotion Recognition

Jan 14, 2025

Nessrine Farhat, Amine Bohi, Leila Ben Letaifa, Rim Slama

Abstract:The field of affective computing has seen significant advancements in exploring the relationship between emotions and emerging technologies. This paper presents a novel and valuable contribution to this field with the introduction of a comprehensive French multimodal dataset designed specifically for emotion recognition. The dataset encompasses three primary modalities: facial expressions, speech, and gestures, providing a holistic perspective on emotions. Moreover, the dataset has the potential to incorporate additional modalities, such as Natural Language Processing (NLP), to expand the scope of emotion recognition research. The dataset was curated by engaging participants in card game sessions, where they were prompted to express a range of emotions while responding to diverse questions. The study included 10 sessions with 20 participants (9 females and 11 males). The dataset serves as a valuable resource for furthering research in emotion recognition and provides an avenue for exploring the intricate connections between human emotions and digital technologies.

* 8 pages, 2 figures and 4 tables. Sixteenth International Conference on Machine Vision (ICMV 2023), Yerevan, Armenia

On the "Illusion" of Gender Bias in Face Recognition: Explaining the Fairness Issue Through Non-demographic Attributes

Jan 21, 2025

Paul Jonas Kurz, Haiyu Wu, Kevin W. Bowyer, Philipp Terhörst

Abstract:Face recognition systems (FRS) exhibit significant accuracy differences based on the user's gender. Since such a gender gap reduces the trustworthiness of FRS, more recent efforts have tried to find the causes. However, these studies make use of manually selected, correlated, and small-sized sets of facial features to support their claims. In this work, we analyse gender bias in face recognition by successfully extending the search domain to decorrelated combinations of 40 non-demographic facial characteristics. First, we propose a toolchain to effectively decorrelate and aggregate facial attributes to enable a less-biased gender analysis on large-scale data. Second, we introduce two new fairness metrics to measure fairness with and without context. Based on these grounds, we thirdly present a novel unsupervised algorithm able to reliably identify attribute combinations that lead to vanishing bias when used as filter predicates for balanced testing datasets. The experiments show that the gender gap vanishes when images of male and female subjects share specific attributes, clearly indicating that the issue is not a question of biology but of the social definition of appearance. These findings could reshape our understanding of fairness in face biometrics and provide insights into FRS, helping to address gender bias issues.
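One common way to quantify a "gender gap" of the kind discussed above is to compare an error rate between groups at a fixed decision threshold. The sketch below does this with the false non-match rate (FNMR) on genuine comparison scores. It is a generic illustration only: the abstract does not specify the paper's two new fairness metrics, and the helper names, scores, and threshold here are made up.

```python
def false_non_match_rate(genuine_scores, threshold):
    """Fraction of genuine (same-person) comparisons scored below the threshold."""
    misses = sum(1 for s in genuine_scores if s < threshold)
    return misses / len(genuine_scores)

def fnmr_gap(scores_by_group, threshold):
    """Per-group FNMR and the difference between the worst and best group."""
    rates = {group: false_non_match_rate(scores, threshold)
             for group, scores in scores_by_group.items()}
    return rates, max(rates.values()) - min(rates.values())

# Hypothetical genuine-pair similarity scores for two groups.
scores = {
    "female": [0.62, 0.71, 0.55, 0.48],
    "male": [0.75, 0.82, 0.58, 0.90],
}
rates, gap = fnmr_gap(scores, threshold=0.6)
```

Filtering both groups by a shared attribute before calling `fnmr_gap` would be one way to check whether the gap shrinks for a given attribute combination, in the spirit of the filter predicates the abstract mentions.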

Pairwise Discernment of AffectNet Expressions with ArcFace

Dec 01, 2024

Dylan Waldner, Shyamal Mitra

Abstract:This study takes a preliminary step toward teaching computers to recognize human emotions through Facial Emotion Recognition (FER). Transfer learning is applied using ResNeXt, EfficientNet models, and an ArcFace model originally trained on the facial verification task, leveraging the AffectNet database, a collection of human face images annotated with corresponding emotions. The findings highlight the value of congruent domain transfer learning, the challenges posed by imbalanced datasets in learning facial emotion patterns, and the effectiveness of pairwise learning in addressing class imbalances to enhance model performance on the FER task.

Prior-based Objective Inference Mining Potential Uncertainty for Facial Expression Recognition

Nov 20, 2024

Hanwei Liu, Huiling Cai, Qingcheng Lin, Xuefeng Li, Hui Xiao

Abstract:Annotation ambiguity caused by the inherent subjectivity of visual judgment has always been a major challenge for Facial Expression Recognition (FER) tasks, particularly for large-scale datasets from in-the-wild scenarios. A potential solution is the evaluation of relatively objective emotional distributions to help mitigate the ambiguity of subjective annotations. To this end, this paper proposes a novel Prior-based Objective Inference (POI) network. This network employs prior knowledge to derive a more objective and varied emotional distribution and tackles the issue of subjective annotation ambiguity through dynamic knowledge transfer. POI comprises two key networks. First, the Prior Inference Network (PIN) utilizes the prior knowledge of AUs and emotions to capture intricate motion details. To reduce over-reliance on priors and facilitate objective emotional inference, PIN aggregates inferential knowledge from various key facial subregions, encouraging mutual learning. Second, the Target Recognition Network (TRN) integrates subjective emotion annotations and objective inference soft labels provided by the PIN, fostering an understanding of inherent facial expression diversity, thus resolving annotation ambiguity. Moreover, we introduce an uncertainty estimation module to quantify and balance facial expression confidence. This module enables a flexible approach to dealing with the uncertainties of subjective annotations. Extensive experiments show that POI exhibits competitive performance on both synthetic noisy datasets and multiple real-world datasets. All code and training logs will be publicly available at https://github.com/liuhw01/POI.

CLIP Unreasonable Potential in Single-Shot Face Recognition

Nov 20, 2024

Nhan T. Luu

Abstract:Face recognition is a core task in computer vision designed to identify and authenticate individuals by analyzing facial patterns and features. This field intersects with artificial intelligence, image processing, and machine learning, with applications in security, authentication, and personalization. Traditional approaches in facial recognition focus on capturing facial features like the eyes, nose, and mouth and matching these against a database to verify identities. However, challenges such as high false positive rates have persisted, often due to the similarity among individuals' facial features. Recently, Contrastive Language Image Pretraining (CLIP), a model developed by OpenAI, has shown promising advancements by linking natural language processing with vision tasks, allowing it to generalize across modalities. Using CLIP's vision-language correspondence and single-shot finetuning, the model can achieve lower false positive rates upon deployment without the need for mass facial feature extraction. This integration demonstrates CLIP's potential to address persistent issues in face recognition model performance without complicating our training paradigm.

Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors

Nov 19, 2024

Yuanyuan Liu, Lin Wei, Kejun Liu, Yibing Zhan, Zijing Chen, Zhe Chen, Shiguang Shan

Abstract:Emotion Recognition (ER) is the process of identifying human emotions from given data. Currently, the field heavily relies on facial expression recognition (FER) because facial expressions contain rich emotional cues. However, it is important to note that facial expressions may not always precisely reflect genuine emotions and FER-based results may yield misleading ER. To understand and bridge this gap between FER and ER, we introduce eye behaviors as important emotional cues for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset. Different from existing multimodal ER datasets, the EMER dataset employs a stimulus material-induced spontaneous emotion generation method to integrate non-invasive eye behavior data, like eye movements and eye fixation maps, with facial videos, aiming to obtain natural and accurate human emotions. Notably, for the first time, we provide annotations for both ER and FER in the EMER, enabling a comprehensive analysis to better illustrate the gap between both tasks. Furthermore, we specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER by efficiently identifying and bridging the emotion gap between the two. Specifically, our EMERT employs modality-adversarial feature decoupling and multi-task Transformer to augment the modeling of eye behaviors, thus providing an effective complement to facial expressions. In the experiment, we introduce seven multimodal benchmark protocols for a variety of comprehensive evaluations of the EMER dataset. The results show that the EMERT outperforms other state-of-the-art multimodal methods by a great margin, revealing the importance of modeling eye behaviors for robust ER. To sum up, we provide a comprehensive analysis of the importance of eye behaviors in ER, advancing the study on addressing the gap between FER and ER for more robust ER performance.

* The paper is part of ongoing work, and we request to withdraw it from arXiv to revise it further. The paper was also submitted without agreement from all co-authors.

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Nov 25, 2024

Hanhui Wang, Yihua Zhang, Ruizheng Bai, Yue Zhao, Sijia Liu, Zhengzhong Tu

Abstract:Recent advancements in diffusion models have made generative image editing more accessible, enabling creative edits but raising ethical concerns, particularly regarding malicious edits to human portraits that threaten privacy and identity security. Existing protection methods primarily rely on adversarial perturbations to nullify edits but often fail against diverse editing requests. We propose FaceLock, a novel approach to portrait protection that optimizes adversarial perturbations to destroy or significantly alter biometric information, rendering edited outputs biometrically unrecognizable. FaceLock integrates facial recognition and visual perception into perturbation optimization to provide robust protection against various editing attempts. We also highlight flaws in commonly used evaluation metrics and reveal how they can be manipulated, emphasizing the need for reliable assessments of protection. Experiments show FaceLock outperforms baselines in defending against malicious edits and is robust against purification techniques. Ablation studies confirm its stability and broad applicability across diffusion-based editing algorithms. Our work advances biometric defense and sets the foundation for privacy-preserving practices in image editing. The code is available at: https://github.com/taco-group/FaceLock.

* GitHub: https://github.com/taco-group/FaceLock
