Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"facial recognition": models, code, and papers

Face Recognition by Fusion of Local and Global Matching Scores using DS Theory: An Evaluation with Uni-classifier and Multi-classifier Paradigm

Feb 02, 2010
Dakshina Ranjan Kisku, Massimo Tistarelli, Jamuna Kanta Sing, Phalguni Gupta

Faces are highly deformable objects which may easily change their appearance over time. Not all face areas are subject to the same variability. Therefore decoupling the information from independent areas of the face is of paramount importance to improve the robustness of any face recognition technique. This paper presents a robust face recognition technique based on the extraction and matching of SIFT features related to independent face areas. Both a global and local (as recognition from parts) matching strategy is proposed. The local strategy is based on matching individual salient facial SIFT features as connected to facial landmarks such as the eyes and the mouth. As for the global matching strategy, all SIFT features are combined together to form a single feature. In order to reduce the identification errors, the Dempster-Shafer decision theory is applied to fuse the two matching techniques. The proposed algorithms are evaluated with the ORL and the IITK face databases. The experimental results demonstrate the effectiveness and potential of the proposed face recognition techniques also in the case of partially occluded faces or with missing information.

* 7 pages, 6 figures, IEEE Computer Vision and Pattern Recognition Workshop on Biometrics 
Access Paper or Ask Questions

Attentional Feature-Pair Relation Networks for Accurate Face Recognition

Aug 17, 2019
Bong-Nam Kang, Yonghyun Kim, Bongjin Jun, Daijin Kim

Human face recognition is one of the most important research areas in biometrics. However, the robust face recognition under a drastic change of the facial pose, expression, and illumination is a big challenging problem for its practical application. Such variations make face recognition more difficult. In this paper, we propose a novel face recognition method, called Attentional Feature-pair Relation Network (AFRN), which represents the face by the relevant pairs of local appearance block features with their attention scores. The AFRN represents the face by all possible pairs of the 9x9 local appearance block features, the importance of each pair is considered by the attention map that is obtained from the low-rank bilinear pooling, and each pair is weighted by its corresponding attention score. To increase the accuracy, we select top-K pairs of local appearance block features as relevant facial information and drop the remaining irrelevant. The weighted top-K pairs are propagated to extract the joint feature-pair relation by using bilinear attention network. In experiments, we show the effectiveness of the proposed AFRN and achieve the outstanding performance in the 1:1 face verification and 1:N face identification tasks compared to existing state-of-the-art methods on the challenging LFW, YTF, CALFW, CPLFW, CFP, AgeDB, IJB-A, IJB-B, and IJB-C datasets.

* To appear in ICCV 2019 
Access Paper or Ask Questions

Efficient Facial Expression Analysis For Dimensional Affect Recognition Using Geometric Features

Jun 15, 2021
Vassilios Vonikakis, Stefan Winkler

Despite their continued popularity, categorical approaches to affect recognition have limitations, especially in real-life situations. Dimensional models of affect offer important advantages for the recognition of subtle expressions and more fine-grained analysis. We introduce a simple but effective facial expression analysis (FEA) system for dimensional affect, solely based on geometric features and Partial Least Squares (PLS) regression. The system jointly learns to estimate Arousal and Valence ratings from a set of facial images. The proposed approach is robust, efficient, and exhibits comparable performance to contemporary deep learning models, while requiring a fraction of the computational resources.

Access Paper or Ask Questions

CASIA-Face-Africa: A Large-scale African Face Image Database

May 11, 2021
Jawad Muhammad, Yunlong Wang, Caiyong Wang, Kunbo Zhang, Zhenan Sun

Face recognition is a popular and well-studied area with wide applications in our society. However, racial bias had been proven to be inherent in most State Of The Art (SOTA) face recognition systems. Many investigative studies on face recognition algorithms have reported higher false positive rates of African subjects cohorts than the other cohorts. Lack of large-scale African face image databases in public domain is one of the main restrictions in studying the racial bias problem of face recognition. To this end, we collect a face image database namely CASIA-Face-Africa which contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. Demographic attributes and facial expressions of the subjects are also carefully recorded. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. A group of evaluation protocols are constructed according to different applications, tasks, partitions and scenarios. The performances of SOTA face recognition algorithms without re-training are reported as baselines. The proposed database along with its face landmark annotations, evaluation protocols and preliminary results form a good benchmark to study the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc. The database can be downloaded from our

* This paper has been accepted for publication in the journal IEEE TIFS 
Access Paper or Ask Questions

Improving Facial Attribute Recognition by Group and Graph Learning

May 28, 2021
Zhenghao Chen, Shuhang Gu, Feng Zhu, Jing Xu, Rui Zhao

Exploiting the relationships between attributes is a key challenge for improving multiple facial attribute recognition. In this work, we are concerned with two types of correlations that are spatial and non-spatial relationships. For the spatial correlation, we aggregate attributes with spatial similarity into a part-based group and then introduce a Group Attention Learning to generate the group attention and the part-based group feature. On the other hand, to discover the non-spatial relationship, we model a group-based Graph Correlation Learning to explore affinities of predefined part-based groups. We utilize such affinity information to control the communication between all groups and then refine the learned group features. Overall, we propose a unified network called Multi-scale Group and Graph Network. It incorporates these two newly proposed learning strategies and produces coarse-to-fine graph-based group features for improving facial attribute recognition. Comprehensive experiments demonstrate that our approach outperforms the state-of-the-art methods.

* ICME2021(Oral) 
Access Paper or Ask Questions

Web-based visualisation of head pose and facial expressions changes: monitoring human activity using depth data

Mar 16, 2017
Grigorios Kalliatakis, Nikolaos Vidakis, Georgios Triantafyllidis

Despite significant recent advances in the field of head pose estimation and facial expression recognition, raising the cognitive level when analysing human activity presents serious challenges to current concepts. Motivated by the need of generating comprehensible visual representations from different sets of data, we introduce a system capable of monitoring human activity through head pose and facial expression changes, utilising an affordable 3D sensing technology (Microsoft Kinect sensor). An approach build on discriminative random regression forests was selected in order to rapidly and accurately estimate head pose changes in unconstrained environment. In order to complete the secondary process of recognising four universal dominant facial expressions (happiness, anger, sadness and surprise), emotion recognition via facial expressions (ERFE) was adopted. After that, a lightweight data exchange format (JavaScript Object Notation-JSON) is employed, in order to manipulate the data extracted from the two aforementioned settings. Such mechanism can yield a platform for objective and effortless assessment of human activity within the context of serious gaming and human-computer interaction.

* 8th Computer Science and Electronic Engineering, (CEEC 2016), University of Essex, UK, 6 pages 
Access Paper or Ask Questions

A 3D GAN for Improved Large-pose Facial Recognition

Dec 18, 2020
Richard T. Marriott, Sami Romdhani, Liming Chen

Facial recognition using deep convolutional neural networks relies on the availability of large datasets of face images. Many examples of identities are needed, and for each identity, a large variety of images are needed in order for the network to learn robustness to intra-class variation. In practice, such datasets are difficult to obtain, particularly those containing adequate variation of pose. Generative Adversarial Networks (GANs) provide a potential solution to this problem due to their ability to generate realistic, synthetic images. However, recent studies have shown that current methods of disentangling pose from identity are inadequate. In this work we incorporate a 3D morphable model into the generator of a GAN in order to learn a nonlinear texture model from in-the-wild images. This allows generation of new, synthetic identities, and manipulation of pose and expression without compromising the identity. Our synthesised data is used to augment training of facial recognition networks with performance evaluated on the challenging CFPW and Cross-Pose LFW datasets.

Access Paper or Ask Questions

Baseline CNN structure analysis for facial expression recognition

Nov 14, 2016
Minchul Shin, Munsang Kim, Dong-Soo Kwon

We present a baseline convolutional neural network (CNN) structure and image preprocessing methodology to improve facial expression recognition algorithm using CNN. To analyze the most efficient network structure, we investigated four network structures that are known to show good performance in facial expression recognition. Moreover, we also investigated the effect of input image preprocessing methods. Five types of data input (raw, histogram equalization, isotropic smoothing, diffusion-based normalization, difference of Gaussian) were tested, and the accuracy was compared. We trained 20 different CNN models (4 networks x 5 data input types) and verified the performance of each network with test images from five different databases. The experiment result showed that a three-layer structure consisting of a simple convolutional and a max pooling layer with histogram equalization image input was the most efficient. We describe the detailed training procedure and analyze the result of the test accuracy based on considerable observation.

* 6 pages, RO-MAN2016 Conference 
Access Paper or Ask Questions

Measuring Hidden Bias within Face Recognition via Racial Phenotypes

Oct 19, 2021
Seyma Yucer, Furkan Tektas, Noura Al Moubayed, Toby P. Breckon

Recent work reports disparate performance for intersectional racial groups across face recognition tasks: face verification and identification. However, the definition of those racial groups has a significant impact on the underlying findings of such racial bias analysis. Previous studies define these groups based on either demographic information (e.g. African, Asian etc.) or skin tone (e.g. lighter or darker skins). The use of such sensitive or broad group definitions has disadvantages for bias investigation and subsequent counter-bias solutions design. By contrast, this study introduces an alternative racial bias analysis methodology via facial phenotype attributes for face recognition. We use the set of observable characteristics of an individual face where a race-related facial phenotype is hence specific to the human face and correlated to the racial profile of the subject. We propose categorical test cases to investigate the individual influence of those attributes on bias within face recognition tasks. We compare our phenotype-based grouping methodology with previous grouping strategies and show that phenotype-based groupings uncover hidden bias without reliance upon any potentially protected attributes or ill-defined grouping strategies. Furthermore, we contribute corresponding phenotype attribute category labels for two face recognition tasks: RFW for face verification and VGGFace2 (test set) for face identification.

* published in IEEE Winter Conference on Applications of Computer Vision, WACV, 2022 
Access Paper or Ask Questions

Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks

Nov 18, 2020
Thomas Teixeira, Eric Granger, Alessandro Lameiras Koerich

The attention in affect computing and emotion recognition has increased in the last decade. Facial expressions are one of the most powerful ways for depicting specific patterns in human behavior and describing human emotional state. Nevertheless, even for humans, identifying facial expressions is difficult, and automatic video-based systems for facial expression recognition (FER) have often suffered from variations in expressions among individuals, and from a lack of diverse and cross-culture training datasets. However, with video sequences captured in-the-wild and more complex emotion representation such as dimensional models, deep FER systems have the ability to learn more discriminative feature representations. In this paper, we present a survey of the state-of-the-art approaches based on convolutional neural networks (CNNs) for long video sequences recorded with in-the-wild settings, by considering the continuous emotion space of valence and arousal. Since few studies have used 3D-CNN for FER systems and dimensional representation of emotions, we propose an inflated 3D-CNN architecture, allowing for weight inflation of pre-trained 2D-CNN model, in order to operate the essential transfer learning for our video-based application. As a baseline, we also considered a 2D-CNN architecture cascaded network with a long short term memory network, therefore we could finally conclude with a model comparison over two approaches for spatiotemporal representation of facial features and performing the regression of valence/arousal values for emotion prediction. The experimental results on RAF-DB and SEWA-DB datasets have shown that these fine-tuned architectures allow to effectively encode the spatiotemporal information from raw pixel images, and achieved far better results than the current state-of-the-art.

* 28 pages 
Access Paper or Ask Questions