"facial recognition": models, code, and papers

Multi-task, multi-label and multi-domain learning with residual convolutional networks for emotion recognition

Feb 19, 2018
Gerard Pons, David Masip

Automated emotion recognition in the wild from facial images remains a challenging problem. Although recent advances in Deep Learning have brought a significant breakthrough in this topic, strong changes in pose, orientation and point of view severely harm current approaches. In addition, the acquisition of labeled datasets is costly, and current state-of-the-art deep learning algorithms cannot model all the aforementioned difficulties. In this paper, we propose to apply a multi-task learning loss function to share a common feature representation with other related tasks. In particular, we show that emotion recognition benefits from jointly learning a model with a detector of facial Action Units (collective muscle movements). The proposed loss function addresses the problem of learning multiple tasks with heterogeneously labeled data, improving on previous multi-task approaches. We validate the proposal using two datasets acquired in uncontrolled environments, and an application to predict compound facial emotion expressions.

* Preprint submitted to IJCV 
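
The idea of training on heterogeneously labeled data, as described in the abstract above, can be illustrated with a masked multi-task loss: each sample contributes only to the tasks for which it carries a label. This is a minimal sketch under stated assumptions, not the loss function proposed in the paper. The dictionary keys, the `au_weight` parameter and the plain averaging are all hypothetical choices made for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed via log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def binary_cross_entropy(logits, targets):
    """Mean sigmoid cross-entropy over independent binary labels (e.g. Action Units)."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of -[y*log(s(z)) + (1-y)*log(1-s(z))].
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(targets)


def mean(values):
    """Average of a list, defined as 0.0 when no sample carries the label."""
    return sum(values) / len(values) if values else 0.0


def multitask_loss(batch, au_weight=1.0):
    """Sum of per-task losses, each averaged only over samples labeled for that task.

    Each sample is a dict with hypothetical keys:
      emotion_logits, au_logits : model outputs (lists of floats)
      emotion : class index, or None if the sample has no emotion label
      aus     : list of 0/1 Action Unit labels, or None if unlabeled
    """
    emotion_terms = [
        cross_entropy(s["emotion_logits"], s["emotion"])
        for s in batch if s["emotion"] is not None
    ]
    au_terms = [
        binary_cross_entropy(s["au_logits"], s["aus"])
        for s in batch if s["aus"] is not None
    ]
    return mean(emotion_terms) + au_weight * mean(au_terms)
```

A sample drawn from an emotion-only dataset sets `aus` to `None`, and a sample from an Action Unit dataset sets `emotion` to `None`, so both kinds of data can share one batch and one feature extractor.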

SVM-based Multiview Face Recognition by Generalization of Discriminant Analysis

Jan 23, 2010
Dakshina Ranjan Kisku, Hunny Mehrotra, Jamuna Kanta Sing, Phalguni Gupta

Identity verification of authentic persons by their multiview faces is a real valued problem in machine vision. Multiview faces are difficult to handle because of their non-linear representation in the feature space. This paper illustrates the usability of a generalization of LDA, in the form of canonical covariates, for the recognition of multiview faces. In the proposed work, a Gabor filter bank is used to extract facial features that are characterized by spatial frequency, spatial locality and orientation. The Gabor face representation captures a substantial amount of the variation among face instances that often occurs due to illumination, pose and facial expression changes. Convolving the Gabor filter bank with face images of rotated profile views produces Gabor faces with high-dimensional feature vectors. The canonical covariate is then applied to the Gabor faces to reduce the high-dimensional feature spaces to low-dimensional subspaces. Finally, support vector machines are trained on the canonical subspaces, which contain a reduced set of features, and perform the recognition task. The proposed system is evaluated on the UMIST face database. The experimental results demonstrate the efficiency and robustness of the proposed system with high recognition rates.

* International Journal of Computer Systems Science and Engineering (formerly International Journal of Intelligent Systems and Technologies), vol. 3, no. 3, pp. 174--179, 2008 
* 6 pages, 3 figures 
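
The first stage of the pipeline in the abstract above, a Gabor filter bank tuned to spatial frequency and orientation, can be sketched as follows. This is an illustrative construction of the real part of the standard Gabor function. The kernel size, wavelengths, number of orientations and the choice `sigma = 0.5 * wavelength` are assumptions for the example and are not taken from the paper.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter as a size x size list of lists.

    size must be odd so that the kernel has a center pixel.
    theta is the orientation in radians, wavelength the carrier period in pixels.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope (spatial locality) times cosine carrier (frequency).
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size=9, wavelengths=(4.0, 8.0), n_orientations=4):
    """One kernel per (wavelength, orientation) pair; all values are illustrative."""
    return [
        gabor_kernel(size, w, k * math.pi / n_orientations, sigma=0.5 * w)
        for w in wavelengths
        for k in range(n_orientations)
    ]
```

Convolving a face image with every kernel in the bank and concatenating the responses gives the high-dimensional feature vector that the abstract then reduces with the canonical covariate before SVM training.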

A Naturalistic Database of Thermal Emotional Facial Expressions and Effects of Induced Emotions on Memory

Mar 29, 2022
Anna Esposito, Vincenzo Capuano, Jiri Mekyska, Marcos Faundez-Zanuy

This work defines a procedure for collecting naturally induced emotional facial expressions through the viewing of movie excerpts with high emotional content and reports experimental data ascertaining the effects of emotions on memory word recognition tasks. The induced emotional states include the four basic emotions of sadness, disgust, happiness, and surprise, as well as the neutral emotional state. The resulting database contains both thermal and visible emotional facial expressions, portrayed by forty Italian subjects and simultaneously acquired by appropriately synchronizing a thermal and a standard visible camera. Each subject's recording session lasted 45 minutes, allowing for each mode (thermal or visible) to collect a minimum of 2000 facial expressions from which a minimum of 400 were selected as highly expressive of each emotion category. The database is available to the scientific community and can be obtained by contacting one of the authors. For this pilot study, it was found that emotions and/or emotion categories do not affect individual performance on memory word recognition tasks and that temperature changes in the face or in some regions of it do not discriminate among emotional states.

* 15 pages, published in Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg, 2012 

Evaluation of the Spatio-Temporal features and GAN for Micro-expression Recognition System

Apr 03, 2019
Sze-Teng Liong, Y. S. Gan, Danna Zheng, Shu-Meng Lic, Hao-Xuan Xua, Han-Zhe Zhang, Ran-Ke Lyu, Kun-Hong Liu

Owing to the development and advancement of artificial intelligence, numerous works have been established on human facial expression recognition systems. Meanwhile, the detection and classification of micro-expressions have been attracting attention from various research communities in recent years. In this paper, we first review the processes of a conventional optical-flow-based recognition system, which comprises facial landmark annotation, optical flow guided image computation, feature extraction and emotion class categorization. Secondly, a few approaches are proposed to improve the feature extraction part, such as exploiting a GAN to generate more image samples. In particular, several variations of optical flow are computed in order to generate optimal images that lead to high recognition accuracy. Next, a GAN, a combination of a Generator and a Discriminator, is utilized to generate new "fake" images to increase the sample size. Thirdly, a modified state-of-the-art Convolutional Neural Network is proposed. To verify the effectiveness of the proposed method, the results are evaluated on spontaneous micro-expression databases, namely SMIC, CASME II and SAMM. Both the F1-score and accuracy performance metrics are reported in this paper.

* 15 pages, 16 figures, 6 tables 
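
The two metrics named in the abstract above, accuracy and F1-score, can be computed as in the sketch below. The abstract does not say which F1 averaging is used. Macro averaging (the unweighted mean of per-class F1) is assumed here because it is a common choice when classes are imbalanced. The function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is nonzero because
        # every class in `classes` occurs in y_true or y_pred.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)
```

Reporting both numbers is informative because accuracy can stay high when a classifier ignores a rare class, while macro F1 drops in that case.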

Nasal Patches and Curves for Expression-robust 3D Face Recognition

Jan 01, 2019
Mehryar Emambakhsh, Adrian Evans

The potential of the nasal region for expression robust 3D face recognition is thoroughly investigated by a novel five-step algorithm. First, the nose tip location is coarsely detected and the face is segmented, aligned and the nasal region cropped. Then, a very accurate and consistent nasal landmarking algorithm detects seven keypoints on the nasal region. In the third step, a feature extraction algorithm based on the surface normals of Gabor-wavelet filtered depth maps is utilised and, then, a set of spherical patches and curves are localised over the nasal region to provide the feature descriptors. The last step applies a genetic algorithm-based feature selector to detect the most stable patches and curves over different facial expressions. The algorithm provides the highest reported nasal region-based recognition ranks on the FRGC, Bosphorus and BU-3DFE datasets. The results are comparable with, and in many cases better than, many state-of-the-art 3D face recognition algorithms, which use the whole facial domain. The proposed method does not rely on sophisticated alignment or denoising steps, is very robust when only one sample per subject is used in the gallery, and does not require a training step for the landmarking algorithm.

* IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 39, no. 5, pp. 995-1007, 2017 

Group Emotion Recognition Using Machine Learning

May 03, 2019
Samanyou Garg

Automatic facial emotion recognition is a challenging task that has gained significant scientific interest over the past few years, but the problem of emotion recognition for a group of people has been less extensively studied. However, it is slowly gaining popularity due to the massive amount of data available on social networking sites containing images of groups of people participating in various social events. Group emotion recognition is a challenging problem due to obstructions like head and body pose variations, occlusions, variable lighting conditions, variance of actors, varied indoor and outdoor settings and image quality. The objective of this task is to classify a group's perceived emotion as Positive, Neutral or Negative. In this report, we describe our solution which is a hybrid machine learning system that incorporates deep neural networks and Bayesian classifiers. Deep Convolutional Neural Networks (CNNs) work from bottom to top, analysing facial expressions expressed by individual faces extracted from the image. The Bayesian network works from top to bottom, inferring the global emotion for the image, by integrating the visual features of the contents of the image obtained through a scene descriptor. In the final pipeline, the group emotion category predicted by an ensemble of CNNs in the bottom-up module is passed as input to the Bayesian Network in the top-down module and an overall prediction for the image is obtained. Experimental results show that the stated system achieves 65.27% accuracy on the validation set which is in line with state-of-the-art results. As an outcome of this project, a Progressive Web Application and an accompanying Android app with a simple and intuitive user interface are presented, allowing users to test out the system with their own pictures.


Occlusion Robust Face Recognition Based on Mask Learning with Pairwise Differential Siamese Network

Aug 17, 2019
Lingxue Song, Dihong Gong, Zhifeng Li, Changsong Liu, Wei Liu

Deep Convolutional Neural Networks (CNNs) have been pushing the frontier of face recognition research in the past years. However, existing general CNN face models generalize poorly to the scenario of occlusions on variable facial areas. Inspired by the fact that the human visual system explicitly ignores occlusions and only focuses on non-occluded facial areas, we propose a mask learning strategy to find and discard the corrupted feature elements for face recognition. A mask dictionary is first established by exploiting the differences between the top convolutional features of occluded and occlusion-free face pairs using an innovatively designed Pairwise Differential Siamese Network (PDSN). Each item of this dictionary captures the correspondence between occluded facial areas and corrupted feature elements, and is named a Feature Discarding Mask (FDM). When dealing with a face image with random partial occlusions, we generate its FDM by combining relevant dictionary items and then multiply it with the original features to eliminate those corrupted feature elements. Comprehensive experiments on both synthesized and realistic occluded face datasets show that the proposed approach significantly outperforms state-of-the-art methods.
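
The final step described in the abstract above, combining dictionary items into one mask and multiplying it with the features, can be sketched on a flat feature vector. Two assumptions are made for illustration: masks are binary (1 keeps a feature element, 0 discards it), and masks are combined with an element-wise AND. The abstract only says that relevant items are "combined". The block names and function names are hypothetical.

```python
def combine_masks(mask_dictionary, occluded_blocks):
    """Element-wise AND of the binary masks of all occluded face blocks.

    mask_dictionary maps a block name to a list of 0/1 values with
    1 = keep the feature element and 0 = discard it. All masks are
    assumed to have the same length. With no occluded block, every
    feature element is kept.
    """
    length = len(next(iter(mask_dictionary.values())))
    combined = [1] * length
    for block in occluded_blocks:
        combined = [c & m for c, m in zip(combined, mask_dictionary[block])]
    return combined


def apply_mask(features, mask):
    """Zero out the feature elements that the mask marks as corrupted."""
    return [f * m for f, m in zip(features, mask)]
```

For example, with `{"left_eye": [0, 1, 1, 1], "mouth": [1, 1, 0, 1]}` as the dictionary, a face occluded in both blocks keeps only the second and fourth feature elements.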


Neural Architecture Searching for Facial Attributes-based Depression Recognition

Jan 24, 2022
Mingzhe Chen, Xi Xiao, Bin Zhang, Xinyu Liu, Runiu Lu

Recent studies show that depression can be partially reflected in human facial attributes. Since facial attributes have various data structures and carry different information, existing approaches fail to specifically consider the optimal way to extract depression-related features from each of them, or to investigate the best fusion strategy. In this paper, we propose to extend the Neural Architecture Search (NAS) technique to design an optimal model for multiple facial attributes-based depression recognition, which can be efficiently and robustly implemented on a small dataset. Our approach first conducts a warm-up step on the feature extractor of each facial attribute, aiming to largely reduce the search space and provide a customized architecture, where each feature extractor can be either a Convolutional Neural Network (CNN) or a Graph Neural Network (GNN). Then, we conduct an end-to-end architecture search for all feature extractors and the fusion network, allowing the complementary depression cues to be optimally combined with less redundancy. The experimental results on the AVEC 2016 dataset show that the model explored by our approach achieves breakthrough performance, with 27% and 30% RMSE and MAE improvements over the existing state-of-the-art. In light of these findings, this paper provides solid evidence and a strong baseline for applying NAS to time-series data-based mental health analysis.
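
The RMSE and MAE figures quoted in the abstract above are regression error metrics, and an "improvement" over a baseline is usually read as a relative reduction in error. The sketch below computes both metrics and that relative reduction. The abstract does not state how its percentages were calculated, so `relative_improvement` is an assumed interpretation, and the numbers in the usage note are made up for illustration.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between predicted and true scores."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predictions, targets):
    """Mean absolute error between predicted and true scores."""
    absolute = [abs(p - t) for p, t in zip(predictions, targets)]
    return sum(absolute) / len(absolute)


def relative_improvement(baseline_error, new_error):
    """Fractional error reduction relative to the baseline (assumed definition)."""
    return (baseline_error - new_error) / baseline_error
```

Under this reading, a hypothetical baseline RMSE of 10.0 reduced to 7.3 would count as a 27% improvement, since `relative_improvement(10.0, 7.3)` is approximately 0.27.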


Robust 3D face recognition in presence of pose and partial occlusions or missing parts

Aug 16, 2014
Parama Bagchi, Debotosh Bhattacharjee, Mita Nasipuri

In this paper, we propose a robust 3D face recognition system that can handle pose as well as occlusions in the real world. The system first takes a 3D range image as input and registers it using the ICP (Iterative Closest Point) algorithm. The ICP used in this work registers facial surfaces to a common model by minimizing the distances between a probe model and a gallery model. However, the performance of ICP relies heavily on the initial conditions. Hence, it is necessary to provide an initial registration, which is then improved iteratively until it converges to the best possible alignment. Once the faces are registered, the occlusions are automatically extracted by thresholding the depth map values of the 3D image. After the occluded regions are detected, restoration is done by Principal Component Analysis (PCA). The restored images, after the removal of occlusions, are then fed to the recognition system for classification. Features are extracted from the reconstructed non-occluded face images in the form of face normals. The experimental results, obtained on occluded facial images from the Bosphorus 3D face database, illustrate that our occlusion compensation scheme attains a recognition accuracy of 91.30%.

* 15 pages, International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.4, July 2014 
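
The occlusion extraction step in the abstract above, thresholding the depth map of a registered face, can be sketched as follows. The abstract does not say what the depth values are compared against. This example assumes a reference depth map (for instance a mean face in the same registered pose) and flags pixels whose depth deviates from it by more than a threshold. The function names, the reference map and the threshold value are all hypothetical.

```python
def detect_occlusion(depth_map, reference_map, threshold):
    """Boolean mask that is True where depth deviates from the reference by more than threshold.

    depth_map and reference_map are equally sized lists of rows of depth values,
    assumed to be already registered to a common pose (e.g. after ICP).
    """
    return [
        [abs(d - r) > threshold for d, r in zip(depth_row, ref_row)]
        for depth_row, ref_row in zip(depth_map, reference_map)
    ]


def occluded_fraction(mask):
    """Share of pixels flagged as occluded."""
    flags = [flag for row in mask for flag in row]
    return sum(flags) / len(flags)
```

The pixels flagged here are the ones a restoration step, PCA in the abstract, would then need to fill in before feature extraction.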

Using Photorealistic Face Synthesis and Domain Adaptation to Improve Facial Expression Analysis

May 17, 2019
Behzad Bozorgtabar, Mohammad Saeed Rad, Hazim Kemal Ekenel, Jean-Philippe Thiran

Cross-domain synthesis of realistic faces to learn deep models has attracted increasing attention for facial expression analysis, as it helps to improve expression recognition accuracy despite a small number of real training images. However, learning from synthetic face images can be problematic due to the distribution discrepancy between low-quality synthetic images and real face images, and may not achieve the desired performance when the learned model is applied to real-world scenarios. To this end, we propose a new attribute guided face image synthesis to perform a translation between multiple image domains using a single model. In addition, we adopt the proposed model to learn from synthetic faces by matching the feature distributions between different domains while preserving each domain's characteristics. We evaluate the effectiveness of the proposed approach on several face datasets on generating realistic face images. We demonstrate that the expression recognition performance can be enhanced by benefiting from our face synthesis model. Moreover, we also conduct experiments on a near-infrared dataset containing facial expression videos of drivers to assess the performance using in-the-wild data for driver emotion recognition.

* 8 pages, 8 figures, 5 tables, accepted by FG 2019. arXiv admin note: substantial text overlap with arXiv:1905.00286 