"facial recognition": models, code, and papers

Survey on Emotional Body Gesture Recognition

Jan 23, 2018
Fatemeh Noroozi, Ciprian Adrian Corneanu, Dorota Kamińska, Tomasz Sapiński, Sergio Escalera, Gholamreza Anbarjafari

Automatic emotion recognition has become a trending research topic in the past decade. While works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. We present a new comprehensive survey hoping to boost research in the field. We first introduce emotional body gestures as a component of what is commonly known as "body language" and comment on general aspects such as gender differences and culture dependence. We then define a complete framework for automatic emotional body gesture recognition. We introduce person detection and comment on static and dynamic body pose estimation methods in both RGB and 3D. We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. We also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. While pre-processing methodologies (e.g. human detection and pose estimation) are nowadays mature technologies fully developed for robust large-scale analysis, we show that for emotion recognition the quantity of labelled data is scarce, there is no agreement on clearly defined output spaces, and the representations are shallow and largely based on naive geometrical representations.


Using Self-Supervised Co-Training to Improve Facial Representation

May 13, 2021
Mahdi Pourmirzaei, Farzaneh Esmaili, Gholam Ali Montazer

In this paper, the impact of ImageNet pre-training on Facial Expression Recognition (FER) was first tested under different augmentation levels. The results showed that training from scratch could reach better performance than ImageNet fine-tuning at stronger augmentation levels. After that, a framework was proposed for standard Supervised Learning (SL), called Hybrid Learning (HL), which used Self-Supervised co-training with SL in a Multi-Task Learning (MTL) manner. Leveraging Self-Supervised Learning (SSL) could gain additional information from the input data, such as spatial information from faces, which helped the main SL task. It was investigated how this method could be used for FER problems with self-supervised pretext tasks such as Jigsaw puzzling and in-painting. The supervised head (SH) was helped by these two methods to lower the error rate under different augmentations and in the low-data regime, with the same training settings. The state of the art was reached on AffectNet via two completely different HL methods, without utilizing additional datasets. Moreover, HL's effect was shown on two other face-related problems, head pose estimation and gender recognition, where it reduced the error rate by up to 9% and 1% respectively. Also, we saw that the HL methods prevented the model from overfitting.
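The Jigsaw pretext task and the multi-task objective mentioned in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names `split_into_tiles`, `make_jigsaw_sample`, and `hybrid_loss` are hypothetical, the 2x2 grid is arbitrary, and the weighted sum is only one common way to combine a supervised loss with a self-supervised one. The image is a plain nested list so the example has no dependencies.

```python
import random


def split_into_tiles(image, grid):
    """Cut a 2D image (list of rows) into grid*grid tiles, row-major order.

    Rows and columns that do not divide evenly by `grid` are dropped.
    """
    tile_h = len(image) // grid
    tile_w = len(image[0]) // grid
    return [
        [row[c * tile_w:(c + 1) * tile_w]
         for row in image[r * tile_h:(r + 1) * tile_h]]
        for r in range(grid)
        for c in range(grid)
    ]


def make_jigsaw_sample(image, grid=2, seed=None):
    """Shuffle the tiles of `image`; return (shuffled_image, permutation).

    permutation[k] is the index of the original tile placed at position k.
    A self-supervised head would be trained to predict this permutation.
    """
    rng = random.Random(seed)
    tiles = split_into_tiles(image, grid)
    perm = list(range(len(tiles)))
    rng.shuffle(perm)
    shuffled = [tiles[i] for i in perm]

    tile_h = len(shuffled[0])
    out = []
    for r in range(grid):
        band = shuffled[r * grid:(r + 1) * grid]  # tiles in one output row
        for y in range(tile_h):
            out.append([px for tile in band for px in tile[y]])
    return out, perm


def hybrid_loss(supervised_loss, ssl_loss, ssl_weight=1.0):
    """Assumed multi-task objective: supervised loss plus weighted SSL loss."""
    return supervised_loss + ssl_weight * ssl_loss
```

In a setup like this, the shuffled image and the original would pass through a shared backbone, with one head predicting the expression label and another predicting the permutation; the weight `ssl_weight` would be a tunable hyperparameter.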


Context Based Emotion Recognition using EMOTIC Dataset

Mar 30, 2020
Ronak Kosti, Jose M. Alvarez, Adria Recasens, Agata Lapedriza

In our everyday lives and social interactions we often try to perceive the emotional states of people. There has been a lot of research in providing machines with a similar capacity for recognizing emotions. From a computer vision perspective, most of the previous efforts have focused on analyzing facial expressions and, in some cases, also body pose. Some of these methods work remarkably well in specific settings. However, their performance is limited in natural, unconstrained environments. Psychological studies show that the scene context, in addition to facial expression and body pose, provides important information to our perception of people's emotions. However, the processing of the context for automatic emotion recognition has not been explored in depth, partly due to the lack of proper data. In this paper we present EMOTIC, a dataset of images of people in a diverse set of natural situations, annotated with their apparent emotion. The EMOTIC dataset combines two different types of emotion representation: (1) a set of 26 discrete categories, and (2) the continuous dimensions Valence, Arousal, and Dominance. We also present a detailed statistical and algorithmic analysis of the dataset along with an analysis of annotators' agreement. Using the EMOTIC dataset we train different CNN models for emotion recognition, combining the information of the bounding box containing the person with the contextual information extracted from the scene. Our results show how scene context provides important information to automatically recognize emotional states and motivate further research in this direction. Dataset and code are open-sourced and available at: and link for the peer-reviewed published article:


The Use of AI for Thermal Emotion Recognition: A Review of Problems and Limitations in Standard Design and Data

Sep 22, 2020
Catherine Ordun, Edward Raff, Sanjay Purushotham

With the increased attention on thermal imagery for Covid-19 screening, the public sector may believe there are new opportunities to exploit thermal as a modality for computer vision and AI. Thermal physiology research has been ongoing since the late nineties. This research lies at the intersections of medicine, psychology, machine learning, optics, and affective computing. We will review the known factors of thermal vs. RGB imaging for facial emotion recognition. But we also propose that thermal imagery may provide a semi-anonymous modality for computer vision, over RGB, which has been plagued by misuse in facial recognition. However, the transition to adopting thermal imagery as a source for any human-centered AI task is not easy and relies on the availability of high fidelity data sources across multiple demographics and thorough validation. This paper takes the reader on a short review of machine learning in thermal FER and the limitations of collecting and developing thermal FER data for AI training. Our motivation is to provide an introductory overview into recent advances for thermal FER and stimulate conversation about the limitations in current datasets.

* Presented at AAAI FSS-20: Artificial Intelligence in Government and Public Sector, Washington, DC, USA 

A Comprehensive Analysis of Deep Learning Based Representation for Face Recognition

Jun 09, 2016
Mostafa Mehdipour Ghazi, Hazim Kemal Ekenel

Deep learning based approaches have been dominating the face recognition field due to the significant performance improvement they have provided on challenging in-the-wild datasets. These approaches have been extensively tested on such unconstrained datasets, Labeled Faces in the Wild and YouTube Faces, to name a few. However, their capability to handle individual appearance variations caused by factors such as head pose, illumination, occlusion, and misalignment has not been thoroughly assessed until now. In this paper, we present a comprehensive study to evaluate the performance of deep learning based face representation under several conditions, including varying head pose angles, upper and lower face occlusion, changing illumination of different strengths, and misalignment due to erroneous facial feature localization. Two successful and publicly available deep learning models, namely VGG-Face and Lightened CNN, have been utilized to extract face representations. The obtained results show that although deep learning provides a powerful representation for face recognition, it can still benefit from preprocessing, for example pose and illumination normalization, in order to achieve better performance under various conditions. In particular, if these variations are not included in the dataset used to train the deep learning model, the role of preprocessing becomes more crucial. Experimental results also show that deep learning based representation is robust to misalignment and can tolerate facial feature localization errors up to 10% of the interocular distance.
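To make the "10% of the interocular distance" tolerance concrete, here is a minimal sketch of how a localization error can be normalized by the distance between the eye centers. The function names are hypothetical, and averaging the per-landmark error is an assumption; the abstract does not say how the paper aggregates errors across landmarks.

```python
import math


def interocular_distance(left_eye, right_eye):
    """Euclidean distance between the two eye centers, in pixels."""
    return math.dist(left_eye, right_eye)


def normalized_landmark_error(predicted, ground_truth, left_eye, right_eye):
    """Mean landmark error divided by the interocular distance.

    `predicted` and `ground_truth` are equal-length lists of (x, y) points.
    `left_eye` and `right_eye` are the ground-truth eye centers.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need matching, non-empty landmark lists")
    iod = interocular_distance(left_eye, right_eye)
    if iod == 0:
        raise ValueError("eye centers coincide; cannot normalize")
    mean_error = sum(
        math.dist(p, g) for p, g in zip(predicted, ground_truth)
    ) / len(predicted)
    return mean_error / iod


def within_tolerance(error, tolerance=0.10):
    """True if a normalized error is at or below the given fraction."""
    return error <= tolerance
```

For example, with eye centers 40 pixels apart, a single landmark that is off by 5 pixels has a normalized error of 5 / 40 = 0.125, which would fall outside a 10% tolerance, while a 3-pixel error (0.075) would fall inside it.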


Meta Transfer Learning for Emotion Recognition

Jun 23, 2020
Dung Nguyen, Sridha Sridharan, Duc Thanh Nguyen, Simon Denman, David Dean, Clinton Fookes

Deep learning has been widely adopted in automatic emotion recognition and has led to significant progress in the field. However, due to insufficient annotated emotion datasets, pre-trained models are limited in their generalization capability and thus lead to poor performance on novel test sets. To mitigate this challenge, transfer learning that fine-tunes pre-trained models has been applied. However, the fine-tuned knowledge may overwrite and/or discard important knowledge learned from pre-trained models. In this paper, we address this issue by proposing a PathNet-based transfer learning method that is able to transfer emotional knowledge learned from one visual/audio emotion domain to another visual/audio emotion domain, and transfer the emotional knowledge learned from multiple audio emotion domains into one another to improve overall emotion recognition accuracy. To show the robustness of our proposed system, various sets of experiments for facial expression recognition and speech emotion recognition tasks on three emotion datasets (SAVEE, EMODB, and eNTERFACE) have been carried out. The experimental results indicate that our proposed system is capable of improving the performance of emotion recognition, making its performance substantially superior to recently proposed transfer learning methods based on fine-tuned or pre-trained models.

* Revision under Journal of Pattern Recognition 

An Efficient Method for Face Recognition System In Various Assorted Conditions

Mar 04, 2014
V. Karthikeyan, K. Vijayalakshmi, P. Jeyakumar

In its early stages, face verification was done using simple geometric models, but the verification process has since developed into a science of sophisticated geometric representations and matching procedures. In recent years, technological advances have brought face recognition systems into strong focus. Researchers are currently conducting intensive research on face recognition systems for wide-area data captured under uncontrolled illumination variation. The proposed face recognition system consists of a novel illumination-insensitive preprocessing method, a hybrid Fourier-based facial feature extraction and a score fusion scheme. We have verified the face recognition in different lighting conditions (day or night) and at different locations (indoor or outdoor). Preprocessing, image detection, feature extraction and face recognition are the methods used for the face verification system. This paper focuses mainly on the issue of robustness to lighting variations. The proposed system has obtained an average verification rate of 88.1% on two-dimensional images under different lighting conditions.

* 9 figures and 5 pages. arXiv admin note: substantial text overlap with arXiv:1401.6108 

Training with the Invisibles: Obfuscating Images to Share Safely for Learning Visual Recognition Models

Jan 01, 2019
Tae-hoon Kim, Dongmin Kang, Kari Pulli, Jonghyun Choi

High-performance visual recognition systems generally require a large collection of labeled images to train. The expensive data curation can be an obstacle to improving recognition performance. Sharing more data allows training better models, but personal and private information in the data prevents such sharing. To promote sharing visual data for learning a recognition model, we propose to obfuscate the images so that humans are not able to recognize their detailed contents, while machines can still utilize them to train new models. We validate our approach by comprehensive experiments on three challenging visual recognition tasks: image classification, attribute classification, and facial landmark detection, on several datasets including SVHN, CIFAR10, Pascal VOC 2012, CelebA, and MTFL. Our method successfully obfuscates the images from human recognition, but a machine model trained with them performs within about 1% margin (up to 0.48%) of the performance of a model trained with the original, non-obfuscated data.


Robust Face Recognition by Constrained Part-based Alignment

Jan 20, 2015
Yuting Zhang, Kui Jia, Yueming Wang, Gang Pan, Tsung-Han Chan, Yi Ma

Developing a reliable and practical face recognition system is a long-standing goal in computer vision research. Existing literature suggests that pixel-wise face alignment is the key to achieving high-accuracy face recognition. By modeling a human face as a set of piece-wise planar surfaces, where each surface corresponds to a facial part, we develop in this paper a Constrained Part-based Alignment (CPA) algorithm for face recognition across pose and/or expression. Our proposed algorithm is based on a trainable CPA model, which learns appearance evidence of individual parts and a tree-structured shape configuration among different parts. Given a probe face, CPA simultaneously aligns all its parts by fitting them to the appearance evidence with consideration of the constraint from the tree-structured shape configuration. This objective is formulated as a norm minimization problem regularized by graph likelihoods. CPA can be easily integrated with many existing classifiers to perform part-based face recognition. Extensive experiments on benchmark face datasets show that CPA outperforms or is on par with existing methods for robust face recognition across pose, expression, and/or illumination changes.


Vesselness features and the inverse compositional AAM for robust face recognition using thermal IR

Jun 07, 2013
Reza Shoja Ghiass, Ognjen Arandjelovic, Hakim Bendada, Xavier Maldague

Over the course of the last decade, infrared (IR) and particularly thermal IR imaging based face recognition has emerged as a promising complement to conventional, visible spectrum based approaches which continue to struggle when applied in the real world. While inherently insensitive to visible spectrum illumination changes, IR images introduce specific challenges of their own, most notably sensitivity to factors which affect facial heat emission patterns, e.g. emotional state, ambient temperature, and alcohol intake. In addition, facial expression and pose changes are more difficult to correct in IR images because they are less rich in high frequency detail which is an important cue for fitting any deformable model. We describe a novel method which addresses these challenges. To normalize for pose and facial expression changes we generate a synthetic frontal image of a face in a canonical, neutral facial expression from an image of the face in an arbitrary pose and facial expression. This is achieved by piecewise affine warping which follows active appearance model (AAM) fitting. This is the first publication which explores the use of an AAM on thermal IR images; we propose a pre-processing step which enhances detail in thermal images, making AAM convergence faster and more accurate. To overcome the problem of thermal IR image sensitivity to the pattern of facial temperature emissions we describe a representation based on reliable anatomical features. In contrast to previous approaches, our representation is not binary; rather, our method accounts for the reliability of the extracted features. This makes the proposed representation much more robust both to pose and scale changes. The effectiveness of the proposed approach is demonstrated on the largest public database of thermal IR images of faces on which it achieved 100% identification, significantly outperforming previous methods.

* AAAI Conference on Artificial Intelligence, 2013 