Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"facial recognition": models, code, and papers

Diversity in Faces

Jan 29, 2019
Michele Merler, Nalini Ratha, Rogerio S. Feris, John R. Smith

Face recognition is a long standing challenge in the field of Artificial Intelligence (AI). The goal is to create systems that accurately detect, recognize, verify, and understand human faces. There are significant technical hurdles in making these systems accurate, particularly in unconstrained settings due to confounding factors related to pose, resolution, illumination, occlusion, and viewpoint. However, with recent advances in neural networks, face recognition has achieved unprecedented accuracy, largely built on data-driven deep learning methods. While this is encouraging, a critical aspect that is limiting facial recognition accuracy and fairness is inherent facial diversity. Every face is different. Every face reflects something unique about us. Aspects of our heritage - including race, ethnicity, culture, geography - and our individual identify - age, gender, and other visible manifestations of self-expression, are reflected in our faces. We expect face recognition to work equally accurately for every face. Face recognition needs to be fair. As we rely on data-driven methods to create face recognition technology, we need to ensure necessary balance and coverage in training data. However, there are still scientific questions about how to represent and extract pertinent facial features and quantitatively measure facial diversity. Towards this goal, Diversity in Faces (DiF) provides a data set of one million annotated human face images for advancing the study of facial diversity. The annotations are generated using ten well-established facial coding schemes from the scientific literature. The facial coding schemes provide human-interpretable quantitative measures of facial features. We believe that by making the extracted coding schemes available on a large set of faces, we can accelerate research and development towards creating more fair and accurate facial recognition systems.


Facial Motion Prior Networks for Facial Expression Recognition

Feb 23, 2019
Yuedong Chen, Jianfeng Wang, Shikai Chen, Zhongchao Shi, Jianfei Cai

Deep learning based facial expression recognition (FER) has received a lot of attention in the past few years. Most of the existing deep learning based FER methods do not consider domain knowledge well, which thereby fail to extract representative features. In this work, we propose a novel FER framework, named Facial Motion Prior Networks (FMPN). Particularly, we introduce an addition branch to generate a facial mask so as to focus on facial muscle moving regions. To guide the facial mask learning, we propose to incorporate prior domain knowledge by using the average differences between neutral faces and the corresponding expressive faces as the guidance. Extensive experiments on four facial expression benchmark datasets demonstrate the effectiveness of the proposed method, compared with the state-of-the-art approaches.


Multi-task Cross Attention Network in Facial Behavior Analysis

Jul 21, 2022
Dang-Khanh Nguyen, Sudarshan Pant, Ngoc-Huynh Ho, Guee-Sang Lee, Soo-Huyng Kim, Hyung-Jeong Yang

Facial behavior analysis is a broad topic with various categories such as facial emotion recognition, age and gender recognition, ... Many studies focus on individual tasks while the multi-task learning approach is still open and requires more research. In this paper, we present our solution and experiment result for the Multi-Task Learning challenge of the Affective Behavior Analysis in-the-wild competition. The challenge is a combination of three tasks: action unit detection, facial expression recognition and valance-arousal estimation. To address this challenge, we introduce a cross-attentive module to improve multi-task learning performance. Additionally, a facial graph is applied to capture the association among action units. As a result, we achieve the evaluation measure of 1.24 on the validation data provided by the organizers, which is better than the baseline result of 0.30.


Fawkes: Protecting Personal Privacy against Unauthorized Deep Learning Models

Feb 19, 2020
Shawn Shan, Emily Wenger, Jiayun Zhang, Huiying Li, Haitao Zheng, Ben Y. Zhao

Today's proliferation of powerful facial recognition models poses a real threat to personal privacy. As demonstrated, anyone can canvas the Internet for data, and train highly accurate facial recognition models of us without our knowledge. We need tools to protect ourselves from unauthorized facial recognition systems and their numerous potential misuses. Unfortunately, work in related areas are limited in practicality and effectiveness. In this paper, we propose Fawkes, a system that allow individuals to inoculate themselves against unauthorized facial recognition models. Fawkes achieves this by helping users adding imperceptible pixel-level changes (we call them "cloaks") to their own photos before publishing them online. When collected by a third-party "tracker" and used to train facial recognition models, these "cloaked" images produce functional models that consistently misidentify the user. We experimentally prove that Fawkes provides 95+% protection against user recognition regardless of how trackers train their models. Even when clean, uncloaked images are "leaked" to the tracker and used for training, Fawkes can still maintain a 80+% protection success rate. In fact, we perform real experiments against today's state-of-the-art facial recognition services and achieve 100% success. Finally, we show that Fawkes is robust against a variety of countermeasures that try to detect or disrupt cloaks.


On Improving the Generalization of Face Recognition in the Presence of Occlusions

Jun 11, 2020
Xiang Xu, Nikolaos Sarafianos, Ioannis A. Kakadiaris

In this paper, we address a key limitation of existing 2D face recognition methods: robustness to occlusions. To accomplish this task, we systematically analyzed the impact of facial attributes on the performance of a state-of-the-art face recognition method and through extensive experimentation, quantitatively analyzed the performance degradation under different types of occlusion. Our proposed Occlusion-aware face REcOgnition (OREO) approach learned discriminative facial templates despite the presence of such occlusions. First, an attention mechanism was proposed that extracted local identity-related region. The local features were then aggregated with the global representations to form a single template. Second, a simple, yet effective, training strategy was introduced to balance the non-occluded and occluded facial images. Extensive experiments demonstrated that OREO improved the generalization ability of face recognition under occlusions by (10.17%) in a single-image-based setting and outperformed the baseline by approximately (2%) in terms of rank-1 accuracy in an image-set-based scenario.

* Technical Report 

SenTion: A framework for Sensing Facial Expressions

Aug 16, 2016
Rahul Islam, Karan Ahuja, Sandip Karmakar, Ferdous Barbhuiya

Facial expressions are an integral part of human cognition and communication, and can be applied in various real life applications. A vital precursor to accurate expression recognition is feature extraction. In this paper, we propose SenTion: A framework for sensing facial expressions. We propose a novel person independent and scale invariant method of extracting Inter Vector Angles (IVA) as geometric features, which proves to be robust and reliable across databases. SenTion employs a novel framework of combining geometric (IVA's) and appearance based features (Histogram of Gradients) to create a hybrid model, that achieves state of the art recognition accuracy. We evaluate the performance of SenTion on two famous face expression data set, namely: CK+ and JAFFE; and subsequently evaluate the viability of facial expression systems by a user study. Extensive experiments showed that SenTion framework yielded dramatic improvements in facial expression recognition and could be employed in real-world applications with low resolution imaging and minimal computational resources in real-time, achieving 15-18 fps on a 2.4 GHz CPU with no GPU.


Coupled Learning for Facial Deblur

Apr 18, 2019
Dayong Tian, Dacheng Tao

Blur in facial images significantly impedes the efficiency of recognition approaches. However, most existing blind deconvolution methods cannot generate satisfactory results due to their dependence on strong edges, which are sufficient in natural images but not in facial images. In this paper, we represent point spread functions (PSFs) by the linear combination of a set of pre-defined orthogonal PSFs, and similarly, an estimated intrinsic (EI) sharp face image is represented by the linear combination of a set of pre-defined orthogonal face images. In doing so, PSF and EI estimation is simplified to discovering two sets of linear combination coefficients, which are simultaneously found by our proposed coupled learning algorithm. To make our method robust to different types of blurry face images, we generate several candidate PSFs and EIs for a test image, and then, a non-blind deconvolution method is adopted to generate more EIs by those candidate PSFs. Finally, we deploy a blind image quality assessment metric to automatically select the optimal EI. Thorough experiments on the facial recognition technology database, extended Yale face database B, CMU pose, illumination, and expression (PIE) database, and face recognition grand challenge database version 2.0 demonstrate that the proposed approach effectively restores intrinsic sharp face images and, consequently, improves the performance of face recognition.


Human-Centered Emotion Recognition in Animated GIFs

Apr 27, 2019
Zhengyuan Yang, Yixuan Zhang, Jiebo Luo

As an intuitive way of expression emotion, the animated Graphical Interchange Format (GIF) images have been widely used on social media. Most previous studies on automated GIF emotion recognition fail to effectively utilize GIF's unique properties, and this potentially limits the recognition performance. In this study, we demonstrate the importance of human related information in GIFs and conduct human-centered GIF emotion recognition with a proposed Keypoint Attended Visual Attention Network (KAVAN). The framework consists of a facial attention module and a hierarchical segment temporal module. The facial attention module exploits the strong relationship between GIF contents and human characters, and extracts frame-level visual feature with a focus on human faces. The Hierarchical Segment LSTM (HS-LSTM) module is then proposed to better learn global GIF representations. Our proposed framework outperforms the state-of-the-art on the MIT GIFGIF dataset. Furthermore, the facial attention module provides reliable facial region mask predictions, which improves the model's interpretability.

* Accepted to IEEE International Conference on Multimedia and Expo (ICME) 2019 

A Statistical Nonparametric Approach of Face Recognition: Combination of Eigenface & Modified k-Means Clustering

Apr 07, 2011
Soumen Bag, Soumen Barik, Prithwiraj Sen, Gautam Sanyal

Facial expressions convey non-verbal cues, which play an important role in interpersonal relations. Automatic recognition of human face based on facial expression can be an important component of natural human-machine interface. It may also be used in behavioural science. Although human can recognize the face practically without any effort, but reliable face recognition by machine is a challenge. This paper presents a new approach for recognizing the face of a person considering the expressions of the same human face at different instances of time. This methodology is developed combining Eigenface method for feature extraction and modified k-Means clustering for identification of the human face. This method endowed the face recognition without using the conventional distance measure classifiers. Simulation results show that proposed face recognition using perception of k-Means clustering is useful for face images with different facial expressions.

* 7 pages, 2 figures. In proceedings of the Second International Conference on Information Processing (ICIP), pp. 198-204, Bangalore, India, 2008 

Unique Faces Recognition in Videos

Jun 10, 2020
Jiahao Huo, Terence L van Zyl

This paper tackles face recognition in videos employing metric learning methods and similarity ranking models. The paper compares the use of the Siamese network with contrastive loss and Triplet Network with triplet loss implementing the following architectures: Google/Inception architecture, 3D Convolutional Network (C3D), and a 2-D Long short-term memory (LSTM) Recurrent Neural Network. We make use of still images and sequences from videos for training the networks and compare the performances implementing the above architectures. The dataset used was the YouTube Face Database designed for investigating the problem of face recognition in videos. The contribution of this paper is two-fold: to begin, the experiments have established 3-D Convolutional networks and 2-D LSTMs with the contrastive loss on image sequences do not outperform Google/Inception architecture with contrastive loss in top $n$ rank face retrievals with still images. However, the 3-D Convolution networks and 2-D LSTM with triplet Loss outperform the Google/Inception with triplet loss in top $n$ rank face retrievals on the dataset; second, a Support Vector Machine (SVM) was used in conjunction with the CNNs' learned feature representations for facial identification. The results show that feature representation learned with triplet loss is significantly better for n-shot facial identification compared to contrastive loss. The most useful feature representations for facial identification are from the 2-D LSTM with triplet loss. The experiments show that learning spatio-temporal features from video sequences is beneficial for facial recognition in videos.

* Paper was accepted into Fusion 2020 conference but will only be published after the virtual conference in July 2020. 7 pages long