Topic:facial recognition

What is facial recognition? Facial recognition is an AI-based technique for identifying or confirming an individual's identity using their face. It maps facial features from an image or video and then compares the information with a collection of known faces to find a match.

Improving Bias in Facial Attribute Classification: A Combined Impact of KL Divergence induced Loss Function and Dual Attention

Oct 15, 2024

Shweta Patel, Dakshina Ranjan Kisku

Abstract:Ensuring that AI-based facial recognition systems produce fair predictions and work equally well across all demographic groups is crucial. Earlier systems often exhibited demographic bias, particularly in gender and racial classification, with lower accuracy for women and individuals with darker skin tones. To tackle this issue and promote fairness in facial recognition, researchers have introduced several bias-mitigation techniques for gender classification and related algorithms. However, many challenges remain, such as data diversity, balancing fairness with accuracy, disparity, and bias measurement. This paper presents a method using a dual attention mechanism with a pre-trained Inception-ResNet V1 model, enhanced by KL-divergence regularization and a cross-entropy loss function. This approach reduces bias while improving accuracy and computational efficiency through transfer learning. The experimental results show significant improvements in both fairness and classification accuracy, providing promising advances in addressing bias and enhancing the reliability of facial recognition systems.

* 15 pages, 9 figures, 5 tables
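The abstract above pairs a cross-entropy loss with KL-divergence regularization but does not give the exact formulation. The following is only a minimal sketch of one way such a combined objective can be written. The `reference` distribution, the weight `lam`, and all function names are assumptions made for illustration, not the authors' definitions.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label] + eps)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def combined_loss(logits, label, reference, lam=0.1):
    """Cross-entropy plus a KL penalty toward a reference distribution.

    ASSUMPTION: `reference` and `lam` are hypothetical; the paper's actual
    regularization target is not stated in the abstract.
    """
    probs = softmax(logits)
    return cross_entropy(probs, label) + lam * kl_divergence(probs, reference)
```

With `lam=0` the objective reduces to plain cross-entropy; larger values pull the predicted distribution toward the chosen reference.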

Bridging the Gaps: Utilizing Unlabeled Face Recognition Datasets to Boost Semi-Supervised Facial Expression Recognition

Oct 23, 2024

Jie Song, Mengqiao He, Jinhua Feng, Bairong Shen

Abstract:In recent years, Facial Expression Recognition (FER) has gained increasing attention. Most current work focuses on supervised learning, which requires a large amount of labeled and diverse images, while FER suffers from the scarcity of large, diverse datasets and annotation difficulty. To address these problems, we focus on utilizing large unlabeled Face Recognition (FR) datasets to boost semi-supervised FER. Specifically, we first perform face reconstruction pre-training on large-scale facial images without annotations to learn features of facial geometry and expression regions, followed by two-stage fine-tuning on FER datasets with limited labels. In addition, to further alleviate the scarcity of labeled and diverse images, we propose a Mixup-based data augmentation strategy tailored for facial images, and the loss weights of real and virtual images are determined according to the intersection-over-union (IoU) of the faces in the two images. Experiments on RAF-DB, AffectNet, and FERPlus show that our method outperforms existing semi-supervised FER methods and achieves new state-of-the-art performance. Remarkably, with only 5%, 25% training sets, our method achieves 64.02% on AffectNet, and 88.23% on RAF-DB, which is comparable to fully supervised state-of-the-art methods. Code will be made publicly available at https://github.com/zhelishisongjie/SSFER.
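The Mixup strategy described above weights the losses of real and virtual images by the IoU of the two faces, but the abstract does not state the exact rule. The sketch below is therefore only illustrative: the box format `(x1, y1, x2, y2)`, the flat pixel lists, and the rule "virtual loss is scaled by IoU" are all assumptions.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mixup(img_a, img_b, lam):
    """Standard Mixup: pixel-wise convex combination of two images."""
    return [lam * pa + (1.0 - lam) * pb for pa, pb in zip(img_a, img_b)]

def weighted_total_loss(real_loss, virtual_loss, iou):
    """ASSUMED rule: trust the virtual (mixed) sample in proportion to
    how well the two faces overlap; the paper may use a different one."""
    return real_loss + iou * virtual_loss
```

Under this assumed rule, a mixed image built from poorly aligned faces (low IoU) contributes little to training, while well-aligned pairs contribute almost as much as real images.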

ErasableMask: A Robust and Erasable Privacy Protection Scheme against Black-box Face Recognition Models

Dec 24, 2024

Sipeng Shen, Yunming Zhang, Dengpan Ye, Xiuwen Shi, Long Tang, Haoran Duan, Jiacheng Deng, Ziyi Liu

Abstract:While face recognition (FR) models have brought remarkable convenience in face verification and identification, they also pose substantial privacy risks to the public. Existing facial privacy protection schemes usually adopt adversarial examples to disrupt face verification of FR models. However, these schemes often suffer from weak transferability against black-box FR models and permanently damage the identifiable information, so they cannot fulfill the requirements of authorized operations such as forensics and authentication. To address these limitations, we propose ErasableMask, a robust and erasable privacy protection scheme against black-box FR models. Specifically, by rethinking the inherent relationship between surrogate FR models, ErasableMask introduces a novel meta-auxiliary attack, which boosts black-box transferability by learning more general features with a stable and balanced optimization strategy. It also offers a perturbation erasion mechanism that supports the erasion of semantic perturbations in protected faces without degrading image quality. To further improve performance, ErasableMask employs a curriculum learning strategy to mitigate optimization conflicts between adversarial attack and perturbation erasion. Extensive experiments on the CelebA-HQ and FFHQ datasets demonstrate that ErasableMask achieves state-of-the-art performance in transferability, achieving over 72% confidence on average in commercial FR systems. Moreover, ErasableMask also exhibits outstanding perturbation erasion performance, achieving over 90% erasion success rate.

Detection of AI Deepfake and Fraud in Online Payments Using GAN-Based Models

Jan 13, 2025

Zong Ke, Shicheng Zhou, Yining Zhou, Chia Hong Chang, Rong Zhang

Abstract:This study explores the use of Generative Adversarial Networks (GANs) to detect AI deepfakes and fraudulent activities in online payment systems. With the growing prevalence of deepfake technology, which can manipulate facial features in images and videos, the potential for fraud in online transactions has escalated. Traditional security systems struggle to identify these sophisticated forms of fraud. This research proposes a novel GAN-based model that enhances online payment security by identifying subtle manipulations in payment images. The model is trained on a dataset consisting of real-world online payment images and deepfake images generated using advanced GAN architectures, such as StyleGAN and DeepFake. The results demonstrate that the proposed model can accurately distinguish between legitimate transactions and deepfakes, achieving a high detection rate above 95%. This approach significantly improves the robustness of payment systems against AI-driven fraud. The paper contributes to the growing field of digital security, offering insights into the application of GANs for fraud detection in financial services.

Keywords: Payment Security, Image Recognition, Generative Adversarial Networks, AI Deepfake, Fraudulent Activities

* The paper will be published and indexed by IEEE at the 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE 2025)

Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age

Oct 31, 2024

Nouar AlDahoul, Myles Joshua Toledo Tan, Harishwar Reddy Kasireddy, Yasir Zaki

Abstract:Technologies for recognizing facial attributes like race, gender, age, and emotion have several applications, such as surveillance, advertising content, sentiment analysis, and the study of demographic trends and social behaviors. Analyzing demographic characteristics based on images and analyzing facial expressions pose several challenges due to the complexity of humans' facial attributes. Traditional approaches have employed CNNs and various other deep learning techniques, trained on extensive collections of labeled images. While these methods demonstrated effective performance, there remains potential for further enhancements. In this paper, we propose to utilize vision language models (VLMs) such as generative pre-trained transformer (GPT), GEMINI, large language and vision assistant (LLAVA), PaliGemma, and Microsoft Florence2 to recognize facial attributes such as race, gender, age, and emotion from images with human faces. Various datasets like FairFace, AffectNet, and UTKFace have been utilized to evaluate the solutions. The results show that VLMs are competitive with, if not superior to, traditional techniques. Additionally, we propose "FaceScanPaliGemma"--a fine-tuned PaliGemma model--for race, gender, age, and emotion recognition. The results show an accuracy of 81.1%, 95.8%, 80%, and 59.4% for race, gender, age group, and emotion classification, respectively, outperforming the pre-trained version of PaliGemma, other VLMs, and SotA methods. Finally, we propose "FaceScanGPT", which is a GPT-4o model to recognize the above attributes when several individuals are present in the image using a prompt engineered for a person with specific facial and/or physical attributes. The results underscore the superior multitasking capability of FaceScanGPT to detect the individual's attributes like haircut, clothing color, postures, etc., using only a prompt to drive the detection and recognition tasks.

* 52 pages, 13 figures

Classification in Japanese Sign Language Based on Dynamic Facial Expressions

Nov 10, 2024

Yui Tatsumi, Shoko Tanaka, Shunsuke Akamatsu, Takahiro Shindo, Hiroshi Watanabe

Abstract:Sign language is a visual language expressed through hand movements and non-manual markers. Non-manual markers include facial expressions and head movements. These expressions vary across different nations. Therefore, specialized analysis methods for each sign language are necessary. However, research on Japanese Sign Language (JSL) recognition is limited due to a lack of datasets. The development of recognition models that consider both manual and non-manual features of JSL is crucial for precise and smooth communication with deaf individuals. In JSL, sentence types such as affirmative statements and questions are distinguished by facial expressions. In this paper, we propose a JSL recognition method that focuses on facial expressions. Our proposed method utilizes a neural network to analyze facial features and classify sentence types. Through the experiments, we confirm our method's effectiveness by achieving a classification accuracy of 96.05%.

* 2024 IEEE 13th Global Conference on Consumer Electronics (GCCE 2024)

OSDFace: One-Step Diffusion Model for Face Restoration

Nov 26, 2024

Jingkai Wang, Jue Gong, Lin Zhang, Zheng Chen, Xing Liu, Hong Gu, Yutong Liu, Yulun Zhang, Xiaokang Yang

Abstract:Diffusion models have demonstrated impressive performance in face restoration. Yet, their multi-step inference process remains computationally intensive, limiting their applicability in real-world scenarios. Moreover, existing methods often struggle to generate face images that are harmonious, realistic, and consistent with the subject's identity. In this work, we propose OSDFace, a novel one-step diffusion model for face restoration. Specifically, we propose a visual representation embedder (VRE) to better capture prior information and understand the input face. In VRE, low-quality faces are processed by a visual tokenizer and subsequently embedded with a vector-quantized dictionary to generate visual prompts. Additionally, we incorporate a facial identity loss derived from face recognition to further ensure identity consistency. We further employ a generative adversarial network (GAN) as a guidance model to encourage distribution alignment between the restored face and the ground truth. Experimental results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics, generating high-fidelity, natural face images with high identity consistency. The code and model will be released at https://github.com/jkwang28/OSDFace.

* 8 pages, 6 figures. The code and model will be available at https://github.com/jkwang28/OSDFace
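The abstract mentions a facial identity loss derived from face recognition without giving its form. A common choice for such a loss is one minus the cosine similarity between face-recognition embeddings of the restored and ground-truth images; the sketch below shows that generic form only. Whether OSDFace uses exactly this expression is an assumption, and the embeddings are taken as plain lists produced by some hypothetical face-recognition encoder.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)

def identity_loss(restored_embedding, target_embedding):
    """ASSUMED form: 0 when the embeddings point the same way,
    growing up to 2 as they point in opposite directions."""
    return 1.0 - cosine_similarity(restored_embedding, target_embedding)
```

Because the loss depends only on direction, rescaling either embedding leaves it unchanged, which is why it is usually applied to normalized face-recognition features.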

Local and Global Feature Attention Fusion Network for Face Recognition

Nov 25, 2024

Wang Yu, Wei Wei

Abstract:Recognition of low-quality face images remains a challenge because parts of the face may be invisible or deformed. For low-quality images dominated by missing partial facial regions, local region similarity contributes more to face recognition (FR). Conversely, in cases dominated by local face deformation, excessive attention to local regions may lead to misjudgments, while global features exhibit better robustness. However, most existing FR methods neglect the bias in feature quality of low-quality images introduced by different factors. To address this issue, we propose a Local and Global Feature Attention Fusion (LGAF) network based on feature quality. The network adaptively allocates attention between local and global features according to feature quality and obtains more discriminative and high-quality face features through local and global information complementarity. In addition, to effectively obtain fine-grained information at various scales and increase the separability of facial features in high-dimensional space, we introduce a Multi-Head Multi-Scale Local Feature Extraction (MHMS) module. Experimental results demonstrate that LGAF achieves the best average performance on four validation sets (CFP-FP, CPLFW, AgeDB, and CALFW) and outperforms state-of-the-art (SoTA) methods on TinyFace and SCFace.

PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition

Dec 10, 2024

Kartik Narayan, Nithin Gopalakrishnan Nair, Jennifer Xu, Rama Chellappa, Vishal M. Patel

Abstract:Pre-training on large-scale datasets and utilizing margin-based loss functions have been highly successful in training models for high-resolution face recognition. However, these models struggle with low-resolution face datasets, in which the faces lack the facial attributes necessary for distinguishing different faces. Full fine-tuning on low-resolution datasets, a naive method for adapting the model, yields inferior performance due to catastrophic forgetting of pre-trained knowledge. Additionally, the domain difference between high-resolution (HR) gallery images and low-resolution (LR) probe images in low-resolution datasets leads to poor convergence for a single model to adapt to both gallery and probe after fine-tuning. To this end, we propose PETALface, a Parameter-Efficient Transfer Learning approach for low-resolution face recognition. Through PETALface, we attempt to solve both of the aforementioned problems. (1) We solve catastrophic forgetting by leveraging the power of parameter-efficient fine-tuning (PEFT). (2) We introduce two low-rank adaptation modules to the backbone, with weights adjusted based on the input image quality to account for the difference in quality between the gallery and probe images. To the best of our knowledge, PETALface is the first work leveraging the power of PEFT for low-resolution face recognition. Extensive experiments demonstrate that the proposed method outperforms full fine-tuning on low-resolution datasets while preserving performance on high-resolution and mixed-quality datasets, all while using only 0.48% of the parameters. Code: https://kartik-3004.github.io/PETALface/

* Accepted to WACV 2025. Project Page: https://kartik-3004.github.io/PETALface/
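The abstract describes two low-rank adaptation modules whose weights depend on input image quality. As a rough illustration of that idea, and not the authors' implementation, the sketch below adds two low-rank updates to a frozen linear layer and blends them with a quality score in [0, 1]. The linear blending rule, the matrix shapes, and every name here are assumptions.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def quality_weighted_lora(x, w_frozen, a_hq, b_hq, a_lq, b_lq, quality):
    """Frozen layer output plus two low-rank updates B(Ax).

    ASSUMPTION: `quality` in [0, 1] linearly blends a high-quality adapter
    (a_hq, b_hq) and a low-quality adapter (a_lq, b_lq); the paper's actual
    weighting scheme is not given in the abstract.
    """
    alpha = min(max(quality, 0.0), 1.0)
    base = matvec(w_frozen, x)
    hq_update = matvec(b_hq, matvec(a_hq, x))  # rank-r path for sharp inputs
    lq_update = matvec(b_lq, matvec(a_lq, x))  # rank-r path for blurry inputs
    return [b + alpha * h + (1.0 - alpha) * l
            for b, h, l in zip(base, hq_update, lq_update)]
```

Only the small `a_*` and `b_*` matrices would be trained in such a setup, while `w_frozen` keeps the pre-trained weights intact, which is the general mechanism by which parameter-efficient fine-tuning limits catastrophic forgetting.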

SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data

Oct 13, 2024

Xilin He, Cheng Luo, Xiaole Xian, Bing Li, Siyang Song, Muhammad Haris Khan, Weicheng Xie, Linlin Shen, Zongyuan Ge

Abstract:Facial expression datasets remain limited in scale due to privacy concerns, the subjectivity of annotations, and the labor-intensive nature of data collection. This limitation poses a significant challenge for developing modern deep learning-based facial expression analysis models, particularly foundation models, that rely on large-scale data for optimal performance. To tackle the overarching and complex challenge, we introduce SynFER (Synthesis of Facial Expressions with Refined Control), a novel framework for synthesizing facial expression image data based on high-level textual descriptions as well as more fine-grained and precise control through facial action units. To ensure the quality and reliability of the synthetic data, we propose a semantic guidance technique to steer the generation process and a pseudo-label generator to help rectify the facial expression labels for the synthetic images. To demonstrate the generation fidelity and the effectiveness of the synthetic data from SynFER, we conduct extensive experiments on representation learning using both synthetic data and real-world data. Experiment results validate the efficacy of the proposed approach and the synthetic data. Notably, our approach achieves a 67.23% classification accuracy on AffectNet when training solely with synthetic data equivalent to the AffectNet training set size, which increases to 69.84% when scaling up to five times the original size. Our code will be made publicly available.
