What is facial recognition?
Facial recognition is an AI-based technique for identifying or confirming an individual's identity using their face. It maps facial features from an image or video and then compares the information with a collection of known faces to find a match.
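As a rough illustration of that matching step, the sketch below compares a probe face embedding against a gallery of known embeddings using cosine similarity. The embedding model is assumed to exist upstream, and the gallery names and 0.6 threshold are illustrative assumptions, not any particular system's values.

```python
# Minimal sketch of the "compare against known faces" step, assuming an
# embedding model (e.g. a CNN) has already mapped each face to a vector.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(probe: np.ndarray, gallery: dict[str, np.ndarray],
             threshold: float = 0.6) -> str | None:
    """Return the gallery identity most similar to the probe embedding,
    or None if no similarity clears the (illustrative) threshold."""
    best_name, best_score = None, threshold
    for name, ref in gallery.items():
        score = cosine_similarity(probe, ref)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```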
Papers and Code
Nov 14, 2024
Abstract: Differentiable rendering methods have emerged as a promising means of generating photo-realistic and physically plausible adversarial attacks that manipulate 3D objects and scenes to deceive deep neural networks (DNNs). Recently, differentiable rendering has evolved into a diverse landscape of libraries, such as Mitsuba and PyTorch3D, and methods, such as Neural Radiance Fields and 3D Gaussian Splatting, for solving inverse rendering problems; these share properties commonly used to attack DNNs, such as back-propagation and optimization. However, the adversarial machine learning research community has not yet fully explored or understood these capabilities for generating attacks. One key reason is that researchers often have different attack goals, such as misclassification or misdetection, and accomplish them through different tasks that manipulate different representations in a scene, such as an object's mesh or texture. This survey adopts a task-oriented unifying framework that systematically summarizes common tasks, such as manipulating textures, altering illumination, and modifying 3D meshes, to exploit vulnerabilities in DNNs. Our framework enables easy comparison of existing works, reveals research gaps, and spotlights exciting future research directions in this rapidly evolving field. By focusing on how these tasks enable attacks on DNNs for tasks such as image classification, facial recognition, object detection, optical flow, and depth estimation, our survey helps researchers and practitioners better understand the vulnerabilities of computer vision systems to photorealistic adversarial attacks that could threaten real-world applications.
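To make the shared mechanism concrete, here is a minimal sketch of the texture-manipulation attack pattern the survey covers: optimizing an object's texture through a differentiable renderer so the rendered image fools a classifier. The toy `render` function and stand-in classifier are assumptions for illustration only; real pipelines would use a library such as Mitsuba or PyTorch3D, whose APIs differ from this.

```python
# Hedged sketch of a differentiable-rendering texture attack: optimize a
# texture so the rendered image is classified as an attacker-chosen class.
import torch
import torch.nn.functional as F

def render(texture: torch.Tensor) -> torch.Tensor:
    # Toy stand-in "renderer": any differentiable map from texture to image
    # keeps the gradient path intact, which is all the attack needs.
    return torch.sigmoid(F.avg_pool2d(texture, kernel_size=2))

classifier = torch.nn.Sequential(  # stand-in victim DNN
    torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

texture = torch.rand(1, 3, 64, 64, requires_grad=True)
target_class = torch.tensor([3])   # class the attacker wants predicted
opt = torch.optim.Adam([texture], lr=0.01)

for _ in range(100):
    image = render(texture)
    loss = F.cross_entropy(classifier(image), target_class)
    opt.zero_grad()
    loss.backward()  # gradients flow back through the renderer to the texture
    opt.step()
```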

Nov 18, 2024
Abstract: Accurate and fast recognition of forgeries is an issue of great importance in the fields of artificial intelligence, image processing, and object detection. Recognizing forgeries of facial imagery is the process of classifying and delineating the faces in an image by analyzing real-world facial images. This process is usually accomplished by extracting features from an image, applying classifier algorithms, and correctly interpreting the results. Correctly recognizing forgeries of facial imagery faces many challenges. For example, factors such as changing lighting conditions and viewing faces from different angles can affect recognition performance, and background complexity and perspective changes in facial images can make accurate recognition difficult. Despite these difficulties, significant progress has been made in the field of forgery detection. Deep learning algorithms, especially Convolutional Neural Networks (CNNs), have significantly improved forgery detection performance. This study focuses on image processing-based forgery detection using the Fake-Vs-Real-Faces (Hard) [10] and 140k Real and Fake Faces [61] data sets. Both data sets consist of two classes containing real and fake facial images. In our study, two lightweight deep learning models are proposed to conduct forgery detection using these images. Additionally, 8 different pretrained CNN architectures were tested on both data sets and the results were compared with the newly developed lightweight CNN models. The proposed lightweight deep learning models are shown to use a minimal number of layers and to detect forgeries of facial imagery accurately and computationally efficiently. Although the data sets consist only of face images, the developed models can also be used for other two-class object recognition problems.
* 13 pages, 6 figures, 10 tables
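The abstract does not spell out the proposed architectures, so the sketch below is a generic lightweight binary CNN of the kind described (few layers, few parameters, two-class real-vs-fake output), not the authors' model.

```python
# Illustrative lightweight CNN for two-class (real vs. fake) forgery
# detection; an assumption standing in for the paper's unpublished models.
import torch.nn as nn

class LightweightForgeryCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(          # few layers, few parameters
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2),                   # logits: real vs. fake
        )

    def forward(self, x):
        return self.head(self.features(x))
```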

Oct 29, 2024
Abstract: Automated Facial Expression Recognition (FER) is challenging due to intra-class variations and inter-class similarities. FER can be especially difficult when facial expressions reflect a mixture of various emotions (a.k.a. compound expressions). Existing FER datasets, such as AffectNet, provide discrete emotion labels (hard-labels), where a single category of emotion is assigned to an expression. To alleviate inter- and intra-class challenges, as well as provide a better facial expression descriptor, we propose a new approach to create FER datasets through a labeling method in which an image is labeled with more than one emotion (called soft-labels), each with different confidences. Specifically, we introduce the notion of soft-labels for facial expression datasets, a new approach to affective computing for more realistic recognition of facial expressions. To achieve this goal, we propose a novel methodology to accurately calculate soft-labels: a vector representing the extent to which multiple categories of emotion are simultaneously present within a single facial expression. Finding smoother decision boundaries, enabling multi-labeling, and mitigating bias and imbalanced data are some of the advantages of our proposed method. Building upon AffectNet, we introduce AffectNet+, the next-generation facial expression dataset. This dataset contains soft-labels, three categories of data complexity subsets, and additional metadata such as age, gender, ethnicity, head pose, facial landmarks, valence, and arousal. AffectNet+ will be made publicly accessible to researchers.
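A soft-label as defined here is just a per-emotion confidence vector, and training against it amounts to cross-entropy against a distribution rather than a single class. The sketch below shows that loss; the authors' labeling pipeline itself is not reproduced, and the three-emotion example vector is illustrative.

```python
# Sketch of training with soft-labels: each image carries a vector of
# per-emotion confidences instead of one hard class.
import torch
import torch.nn.functional as F

def soft_label_loss(logits: torch.Tensor, soft_labels: torch.Tensor) -> torch.Tensor:
    """logits: (batch, n_emotions); soft_labels: (batch, n_emotions), rows sum to 1."""
    return -(soft_labels * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# e.g. an expression judged 60% happy, 30% surprised, 10% neutral:
soft = torch.tensor([[0.6, 0.3, 0.1]])
logits = torch.randn(1, 3)
loss = soft_label_loss(logits, soft)
```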

Oct 25, 2024
Abstract: Automatically recognizing emotional intent from facial expressions has been a thoroughly investigated topic in computer vision. Facial Expression Recognition (FER), being a supervised learning task, relies heavily on substantially large datasets exemplifying various socio-cultural demographic attributes. Over the past decade, several real-world, in-the-wild FER datasets have been proposed, collected through crowd-sourcing or web-scraping. However, most of these practically used datasets employ a manual annotation methodology for labeling emotional intent, which inherently propagates individual demographic biases. Moreover, these datasets also lack equitable representation of various socio-cultural demographic groups, thereby inducing class imbalance. Bias analysis and its mitigation have been investigated across multiple domains and problem settings; in the FER domain, however, this remains a relatively underexplored area. This work leverages representation learning based on latent spaces to mitigate bias in facial expression recognition systems, thereby enhancing a deep learning model's fairness and overall accuracy.
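The abstract does not detail its method, so the sketch below shows one common latent-space debiasing idea rather than this paper's model: a gradient-reversal layer that pushes the encoder to keep expression information while discarding a demographic attribute. The encoder, heads, and dimensions are all assumptions for illustration.

```python
# Generic adversarial-debiasing sketch (gradient reversal), one member of
# the latent-space bias-mitigation family; not the authors' architecture.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output  # reverse gradients flowing back into the encoder

def grad_reverse(x):
    return GradReverse.apply(x)

encoder = torch.nn.Linear(512, 64)         # stand-in feature encoder
emotion_head = torch.nn.Linear(64, 7)      # predicts expression class
demographic_head = torch.nn.Linear(64, 4)  # adversary predicting e.g. age group

features = encoder(torch.randn(8, 512))
emotion_logits = emotion_head(features)
# The adversary sees reversed gradients, so training it well pushes the
# encoder to drop demographic cues while the emotion signal is untouched:
demographic_logits = demographic_head(grad_reverse(features))
```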

Nov 26, 2024
Abstract: User authentication is a pivotal element of security systems. Conventional methods, including passwords, personal identification numbers, and identification tags, are increasingly vulnerable to cyber-attacks. This paper suggests a paradigm shift towards biometric identification technology that leverages unique physiological or behavioral characteristics to verify user authenticity. Nevertheless, biometric solutions such as fingerprint, iris, facial, and voice recognition are also susceptible to forgery and deception. We propose using Electroencephalogram (EEG) signals for individual identification to address this challenge. Derived from unique brain activity, these signals offer promising authentication potential and provide a novel means of liveness detection, thereby mitigating spoofing attacks. This study employs a public dataset initially compiled for fatigue analysis, featuring EEG data from 12 subjects recorded via an eight-channel OpenBCI helmet. Salient features are extracted from the EEG signals and used to train a supervised multiclass Support Vector Machine classifier. Upon evaluation, the classifier achieves a maximum accuracy of 92.9%, leveraging ten features from each channel. Collectively, these findings highlight the viability of machine learning for implementing real-world, EEG-based biometric identification systems, thereby advancing user authentication technology.
* 2023 International Conference on Electrical, Computer and Energy Technologies (ICECET)
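A minimal sketch of the classification stage follows: per-channel features feeding a supervised multiclass SVM, as the abstract describes. The random feature matrix is a placeholder since the paper's ten features per channel are not enumerated in the abstract; only the 8 channels x 10 features = 80-dimensional shape and the 12 subject labels are taken from it.

```python
# Sketch of subject identification with a multiclass SVM on EEG features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder data: many EEG windows, 8 channels x 10 features = 80 dims.
X = rng.normal(size=(600, 80))
y = rng.integers(0, 12, size=600)   # subject identity labels (12 subjects)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf")             # multiclass handled one-vs-one internally
clf.fit(X_train, y_train)
print(f"accuracy: {clf.score(X_test, y_test):.3f}")
```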

Jan 13, 2025
Abstract: This study explores the use of Generative Adversarial Networks (GANs) to detect AI deepfakes and fraudulent activities in online payment systems. With the growing prevalence of deepfake technology, which can manipulate facial features in images and videos, the potential for fraud in online transactions has escalated. Traditional security systems struggle to identify these sophisticated forms of fraud. This research proposes a novel GAN-based model that enhances online payment security by identifying subtle manipulations in payment images. The model is trained on a dataset consisting of real-world online payment images and deepfake images generated using advanced GAN architectures, such as StyleGAN and DeepFake. The results demonstrate that the proposed model can accurately distinguish between legitimate transactions and deepfakes, achieving a detection rate above 95%. This approach significantly improves the robustness of payment systems against AI-driven fraud. The paper contributes to the growing field of digital security, offering insights into the application of GANs to fraud detection in financial services.
Keywords: Payment Security, Image Recognition, Generative Adversarial Networks, AI Deepfake, Fraudulent Activities
* The paper will be published and indexed by IEEE at the 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE 2025)
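The paper's model is not public here, so the sketch below only shows the general shape of a discriminator-style detector that scores an image as legitimate versus deepfake, the role a GAN discriminator plays in such systems; every layer size is an assumption.

```python
# Hedged sketch of a discriminator-style deepfake detector for 64x64 images.
import torch.nn as nn

detector = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),    # 64 -> 32
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),   # 32 -> 16
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),  # 16 -> 8
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 1),  # single logit: legitimate vs. deepfake
)
```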

Nov 07, 2024
Abstract: Machine learning models are prone to adversarial attacks, where inputs can be manipulated to cause misclassifications. While previous research has focused on techniques like Generative Adversarial Networks (GANs), there has been limited exploration of GANs and the Synthetic Minority Oversampling Technique (SMOTE) for attacking text and image classification models. Our study addresses this gap by training various machine learning models and using GANs and SMOTE to generate additional data points aimed at attacking text classification models. Furthermore, we extend our investigation to face recognition models, training a Convolutional Neural Network (CNN) and subjecting it to adversarial attacks with fast gradient sign perturbations on key features identified by GradCAM, a technique used to highlight the image characteristics CNNs rely on for classification. Our experiments reveal a significant vulnerability in classification models. Specifically, we observe a 20% decrease in accuracy for the top-performing text classification models post-attack, along with a 30% decrease in facial recognition accuracy. This highlights the susceptibility of these models to manipulation of input data. Adversarial attacks not only compromise security but also undermine the reliability of machine learning systems. By showcasing the impact of adversarial attacks on both text classification and face recognition models, our study underscores the urgent need to develop robust defenses against such vulnerabilities.
* Accepted for presentation at Electronic Imaging Conference 2025
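The face-recognition attack the abstract describes combines two standard pieces: a fast gradient sign (FGSM) perturbation, restricted to the regions a Grad-CAM heatmap marks as important. The sketch below assumes the heatmap `cam` is precomputed (e.g. by a Grad-CAM implementation), since computing it inline would double the example; the epsilon value is illustrative.

```python
# FGSM perturbation masked by a Grad-CAM saliency heatmap.
import torch
import torch.nn.functional as F

def masked_fgsm(model, image, label, cam, epsilon=0.03):
    """image: (1,3,H,W) in [0,1]; cam: (1,1,H,W) heatmap in [0,1]."""
    image = image.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Standard FGSM step, scaled by the heatmap so only salient pixels move:
    perturbation = epsilon * image.grad.sign() * cam
    return (image + perturbation).detach().clamp(0, 1)
```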

Dec 24, 2024
Abstract: While face recognition (FR) models have brought remarkable convenience to face verification and identification, they also pose substantial privacy risks to the public. Existing facial privacy protection schemes usually adopt adversarial examples to disrupt the face verification of FR models. However, these schemes often suffer from weak transferability against black-box FR models and permanently damage identifiable information, which cannot fulfill the requirements of authorized operations such as forensics and authentication. To address these limitations, we propose ErasableMask, a robust and erasable privacy protection scheme against black-box FR models. Specifically, by rethinking the inherent relationship between surrogate FR models, ErasableMask introduces a novel meta-auxiliary attack, which boosts black-box transferability by learning more general features under a stable and balanced optimization strategy. It also offers a perturbation erasure mechanism that supports erasing the semantic perturbations from a protected face without degrading image quality. To further improve performance, ErasableMask employs a curriculum learning strategy to mitigate optimization conflicts between the adversarial attack and perturbation erasure. Extensive experiments on the CelebA-HQ and FFHQ datasets demonstrate that ErasableMask achieves state-of-the-art transferability, attaining over 72% confidence on average against commercial FR systems. Moreover, ErasableMask also exhibits outstanding perturbation erasure performance, achieving an erasure success rate of over 90%.
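ErasableMask's meta-auxiliary attack and erasure mechanism are considerably more involved than this, so the sketch below only shows the basic adversarial step such privacy schemes build on: perturbing a face so a surrogate FR model's embedding drifts away from the true identity, within an L-infinity budget. The surrogate model, step sizes, and budget are all assumptions.

```python
# Basic embedding-drift protection step (PGD-style), a simplified stand-in
# for the attack component of privacy schemes like ErasableMask.
import torch
import torch.nn.functional as F

def protect(face, surrogate_fr, epsilon=0.03, steps=40, alpha=0.005):
    """face: (1,3,H,W) in [0,1]; surrogate_fr maps images to embeddings."""
    original = F.normalize(surrogate_fr(face), dim=1).detach()
    delta = torch.zeros_like(face, requires_grad=True)
    for _ in range(steps):
        emb = F.normalize(surrogate_fr(face + delta), dim=1)
        loss = F.cosine_similarity(emb, original).mean()  # minimize similarity
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-epsilon, epsilon)   # stay within the L-inf budget
        delta.grad.zero_()
    return (face + delta).clamp(0, 1).detach()
```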

Nov 06, 2024
Abstract: This study focuses on enhancing the Haar Cascade Algorithm to decrease the false positive and false negative rates in face matching and face detection, increasing accuracy even under challenging conditions. The face recognition library was implemented with the Haar Cascade Algorithm, in which 128-dimensional vectors representing the unique features of a face are encoded. A subprocess was applied in which the grayscale image from the Haar Cascade was converted to RGB to improve face encoding. Logical processing and face filtering were also used to reduce non-face detections. The Enhanced Haar Cascade Algorithm produced a 98.39% accuracy rate (a 21.39% increase), a 63.59% precision rate, a 98.30% recall rate, and a 72.23% F1 score. In comparison, the Haar Cascade Algorithm achieved a 46.70% to 77.00% accuracy rate, a 44.15% precision rate, a 98.61% recall rate, and a 47.01% F1 score. Both algorithms were evaluated with a confusion matrix test comprising 301,950 comparisons on the same dataset of 550 images. The 98.39% accuracy rate reflects a significant decrease in false positive and false negative rates in facial recognition. Face matching and face detection are more accurate in images with complex backgrounds, lighting variations, and occlusions, and even among faces with similar attributes.
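The detection-then-encoding pipeline the abstract outlines can be sketched with OpenCV and the face_recognition library: Haar Cascade detection on a grayscale image, conversion to RGB, then 128-dimensional face encodings. The paper's extra logic and filtering steps are not reproduced, and the file path is illustrative.

```python
# Haar Cascade detection + 128-d encodings via the face_recognition library.
import cv2
import face_recognition

img = cv2.imread("face.jpg")                  # illustrative path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Haar works on grayscale
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)    # the encoder expects RGB
# Convert (x, y, w, h) boxes to the (top, right, bottom, left) format
# face_recognition expects, then encode each detected face:
locations = [(y, x + w, y + h, x) for (x, y, w, h) in boxes]
encodings = face_recognition.face_encodings(rgb, known_face_locations=locations)
```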

Oct 23, 2024
Abstract: In recent years, Facial Expression Recognition (FER) has gained increasing attention. Most current work focuses on supervised learning, which requires large amounts of labeled and diverse images, while FER suffers from the scarcity of large, diverse datasets and from annotation difficulty. To address these problems, we focus on utilizing large unlabeled Face Recognition (FR) datasets to boost semi-supervised FER. Specifically, we first perform face reconstruction pre-training on large-scale facial images without annotations to learn features of facial geometry and expression regions, followed by two-stage fine-tuning on FER datasets with limited labels. In addition, to further alleviate the scarcity of labeled and diverse images, we propose a Mixup-based data augmentation strategy tailored for facial images, in which the loss weights of real and virtual images are determined according to the intersection-over-union (IoU) of the faces in the two images. Experiments on RAF-DB, AffectNet, and FERPlus show that our method outperforms existing semi-supervised FER methods and achieves new state-of-the-art performance. Remarkably, with only 5% and 25% of the training sets, our method achieves 64.02% on AffectNet and 88.23% on RAF-DB, which is comparable to fully supervised state-of-the-art methods. Code will be made publicly available at https://github.com/zhelishisongjie/SSFER.
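The augmentation builds on standard Mixup, sketched below. The paper additionally weights the loss by the IoU of the face boxes in the two mixed images; that exact formula is not given in the abstract, so `face_iou` appears only as a placeholder input here.

```python
# Standard Mixup with a placeholder hook for the paper's IoU-based weighting.
import numpy as np
import torch.nn.functional as F

def mixup_loss(model, x1, y1, x2, y2, face_iou: float, alpha: float = 0.2):
    lam = float(np.random.beta(alpha, alpha))
    x_mix = lam * x1 + (1 - lam) * x2    # virtual (mixed) image
    logits = model(x_mix)
    loss = (lam * F.cross_entropy(logits, y1)
            + (1 - lam) * F.cross_entropy(logits, y2))
    # Placeholder for the paper's IoU-based weighting of real vs. virtual terms:
    return face_iou * loss
```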
