"facial recognition": models, code, and papers

MERANet: Facial Micro-Expression Recognition using 3D Residual Attention Network

Dec 07, 2020
Viswanatha Reddy Gajjala, Sai Prasanna Teja Reddy, Snehasis Mukherjee, Shiv Ram Dubey

We propose a facial micro-expression recognition model using a 3D residual attention network, called MERANet. The proposed model combines spatio-temporal attention and channel attention to learn deeper, fine-grained, subtle features for emotion classification. It captures spatial and temporal information simultaneously using 3D kernels and residual connections. Moreover, in each residual module, the channel features and spatio-temporal features are re-calibrated using channel attention and spatio-temporal attention, respectively. The experiments are conducted on benchmark facial micro-expression datasets, where superior performance is observed compared to the state of the art for facial micro-expression recognition.

Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues

Aug 14, 2019
Chris Xiaoxuan Lu, Xuan Kan, Bowen Du, Changhao Chen, Hongkai Wen, Andrew Markham, Niki Trigoni, John Stankovic

Facial recognition is a key enabling component for emerging Internet of Things (IoT) services such as smart homes or responsive offices. Through the use of deep neural networks, facial recognition has achieved excellent performance. However, this is only possible when the networks are trained with hundreds of images of each user in different viewing and lighting conditions. Clearly, this level of effort in enrolment and labelling is impractical for widespread deployment and adoption. Inspired by the fact that most people carry smart wireless devices with them, e.g. smartphones, we propose to use this wireless identifier as a supervisory label. This allows us to curate a dataset of facial images that are unique to a certain domain, e.g. a set of people in a particular office. This custom corpus can then be used to fine-tune existing pre-trained models, e.g. FaceNet. However, due to the vagaries of wireless propagation in buildings, the supervisory labels are noisy and weak. We propose a novel technique, AutoTune, which learns and refines the association between a face and a wireless identifier over time by increasing the inter-cluster separation and minimizing the intra-cluster distance. Through extensive experiments with multiple users on two sites, we demonstrate the ability of AutoTune to design an environment-specific, continually evolving facial recognition system with no user effort.

* 11 pages, accepted in the Web Conference (WWW'2019) 
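
The abstract above names two quantities, inter-cluster separation and intra-cluster distance. The sketch below only measures them for a set of labelled embeddings; it is not AutoTune itself. The device labels, the 2-D embeddings, and the helper `cluster_quality` are hypothetical and chosen for illustration.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cluster_quality(clusters):
    """Return (mean intra-cluster distance, minimum inter-centroid distance).

    clusters maps a label (here: a hypothetical wireless identifier)
    to the face embeddings currently associated with it.
    """
    centroids = {label: centroid(pts) for label, pts in clusters.items()}
    intra = [dist(p, centroids[label])
             for label, pts in clusters.items() for p in pts]
    labels = list(clusters)
    inter = [dist(centroids[a], centroids[b])
             for i, a in enumerate(labels) for b in labels[i + 1:]]
    return sum(intra) / len(intra), min(inter)

# Hypothetical embeddings grouped by wireless identifier (assumption).
clusters = {
    "device_A": [[0.0, 0.0], [0.0, 2.0]],
    "device_B": [[10.0, 0.0], [10.0, 2.0]],
}
intra, inter = cluster_quality(clusters)
```

A well-separated association has a small first value and a large second value; a refinement procedure of the kind the abstract describes would aim to move both in those directions.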

LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition

Jan 25, 2021
Valeriia Cherepanova, Micah Goldblum, Harrison Foley, Shiyuan Duan, John Dickerson, Gavin Taylor, Tom Goldstein

Facial recognition systems are increasingly deployed by private corporations, government agencies, and contractors for consumer services and mass surveillance programs alike. These systems are typically built by scraping social media profiles for user images. Adversarial perturbations have been proposed for bypassing facial recognition systems. However, existing methods fail on full-scale systems and commercial APIs. We develop our own adversarial filter that accounts for the entire image processing pipeline and is demonstrably effective against industrial-grade pipelines that include face detection and large-scale databases. Additionally, we release an easy-to-use webtool that significantly degrades the accuracy of Amazon Rekognition and the Microsoft Azure Face Recognition API, reducing the accuracy of each to below 1%.

* Published as a conference paper at ICLR 2021 

Understanding and Mitigating Annotation Bias in Facial Expression Recognition

Aug 19, 2021
Yunliang Chen, Jungseock Joo

The performance of a computer vision model depends on the size and quality of its training data. Recent studies have unveiled previously-unknown composition biases in common image datasets which then lead to skewed model outputs, and have proposed methods to mitigate these biases. However, most existing works assume that human-generated annotations can be considered gold-standard and unbiased. In this paper, we reveal that this assumption can be problematic, and that special care should be taken to prevent models from learning such annotation biases. We focus on facial expression recognition and compare the label biases between lab-controlled and in-the-wild datasets. We demonstrate that many expression datasets contain significant annotation biases between genders, especially when it comes to the happy and angry expressions, and that traditional methods cannot fully mitigate such biases in trained models. To remove expression annotation bias, we propose an AU-Calibrated Facial Expression Recognition (AUC-FER) framework that utilizes facial action units (AUs) and incorporates the triplet loss into the objective function. Experimental results suggest that the proposed method is more effective in removing expression annotation bias than existing techniques.

* To appear in ICCV 2021 
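
The abstract above mentions incorporating the triplet loss into the objective. As a minimal sketch of the standard hinge-style triplet loss, not of the AUC-FER objective (which the abstract does not spell out), assuming Euclidean distance, a margin of 1.0, and hypothetical 2-D embeddings:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive)
                    - euclidean(anchor, negative) + margin)

# Hypothetical embeddings (assumption, for illustration only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

loss_easy = triplet_loss(anchor, positive, negative)  # 1 - 5 + 1 < 0, clipped to 0.0
loss_hard = triplet_loss(anchor, negative, positive)  # 5 - 1 + 1 = 5.0
```

How the triplets would be formed from action-unit annotations is a design choice of the paper and is not reproduced here.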

Impact of multiple modalities on emotion recognition: investigation into 3d facial landmarks, action units, and physiological data

May 17, 2020
Diego Fabiano, Manikandan Jaishanker, Shaun Canavan

To fully understand the complexities of human emotion, the integration of multiple physical features from different modalities can be advantageous. Considering this, we present an analysis of 3D facial data, action units, and physiological data and their impact on emotion recognition. We analyze each modality independently, as well as their fusion, for recognizing human emotion. This analysis includes which features are most important for specific emotions (e.g. happy). Our analysis indicates that both 3D facial landmarks and physiological data are encouraging for expression/emotion recognition. On the other hand, while action units can positively impact emotion recognition when fused with other modalities, the results suggest it is difficult to detect emotion using them in a unimodal fashion.

Learning Vision Transformer with Squeeze and Excitation for Facial Expression Recognition

Jul 16, 2021
Mouath Aouayeb, Wassim Hamidouche, Catherine Soladie, Kidiyo Kpalma, Renaud Seguier

As various databases of facial expressions have been made accessible over the last few decades, the Facial Expression Recognition (FER) task has attracted a lot of interest. The multiple sources of the available databases raise several challenges for the facial recognition task. These challenges are usually addressed by Convolutional Neural Network (CNN) architectures. Unlike CNN models, a Transformer model based on the attention mechanism has recently been presented to address vision tasks. One of the major issues with Transformers is the need for a large amount of training data, while most FER databases are limited compared to those of other vision applications. Therefore, we propose in this paper to learn a vision Transformer jointly with a Squeeze and Excitation (SE) block for the FER task. The proposed method is evaluated on different publicly available FER databases including CK+, JAFFE, RAF-DB and SFEW. Experiments demonstrate that our model outperforms state-of-the-art methods on CK+ and SFEW and achieves competitive results on JAFFE and RAF-DB.
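
For readers unfamiliar with the Squeeze and Excitation block mentioned above, here is a minimal dependency-free sketch of the generic SE operation on a channels-first feature map. It does not reflect how the paper attaches the block to the vision Transformer; the weights, shapes, and the function name `squeeze_excite` are assumptions, and biases are omitted for brevity.

```python
import math

def squeeze_excite(feature_map, w1, w2):
    """Generic SE block on a feature map given as C channels of H x W lists.

    w1: hidden x C weight matrix (channel reduction),
    w2: C x hidden weight matrix (channel expansion).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: linear layer + ReLU, then linear layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * x for x in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates

# Hypothetical 2-channel, 2x2 feature map and hand-picked weights.
feature_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
w1 = [[0.5, 0.5]]        # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channel gates
scaled, gates = squeeze_excite(feature_map, w1, w2)
```

With these weights the first channel is emphasised and the second suppressed, which is the channel re-weighting the block is designed to learn.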

VGAN-Based Image Representation Learning for Privacy-Preserving Facial Expression Recognition

Sep 07, 2018
Jiawei Chen, Janusz Konrad, Prakash Ishwar

Reliable facial expression recognition plays a critical role in human-machine interactions. However, most of the facial expression analysis methodologies proposed to date pay little or no attention to the protection of a user's privacy. In this paper, we propose a Privacy-Preserving Representation-Learning Variational Generative Adversarial Network (PPRL-VGAN) to learn an image representation that is explicitly disentangled from the identity information. At the same time, this representation is discriminative from the standpoint of facial expression recognition and generative as it allows expression-equivalent face image synthesis. We evaluate the proposed model on two public datasets under various threat scenarios. Quantitative and qualitative results demonstrate that our approach strikes a balance between the preservation of privacy and data utility. We further demonstrate that our model can be effectively applied to other tasks such as expression morphing and image completion.

Semi-Supervised Self-Growing Generative Adversarial Networks for Image Recognition

Aug 11, 2019
Haoqian Wang, Zhiwei Xu, Jun Xu, Wangpeng An, Lei Zhang, Qionghai Dai

Image recognition is an important topic in computer vision and image processing, and has been mainly addressed by supervised deep learning methods, which need a large set of labeled images to achieve promising performance. However, in most cases, labeled data are expensive or even impossible to obtain, while unlabeled data are readily available from numerous free online resources and have been exploited to improve the performance of deep neural networks. To better exploit the power of unlabeled data for image recognition, in this paper, we propose a semi-supervised and generative approach, namely the semi-supervised self-growing generative adversarial network (SGGAN). Label inference is a key step for the success of semi-supervised learning approaches. There are two main problems in label inference: how to measure the confidence of the unlabeled data and how to generalize the classifier. We address these two problems via the generative framework and a novel convolution-block-transformation technique, respectively. To stabilize and speed up the training process of SGGAN, we employ the Maximum Mean Discrepancy metric as the feature matching objective function and achieve a larger gain than the standard semi-supervised GANs (SSGANs), narrowing the gap to the supervised methods. Experiments on several benchmark datasets show the effectiveness of the proposed SGGAN on image recognition and facial attribute recognition tasks. Using training data with only 4% labeled facial attributes, the SGGAN approach can achieve accuracy comparable to that of leading supervised deep learning methods trained with all facial attributes labeled.

* 13 pages, 11 figures, 8 tables. arXiv admin note: text overlap with arXiv:1606.03498 by other authors 
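
The abstract above uses Maximum Mean Discrepancy (MMD) as the feature matching objective. A minimal sketch of the biased estimator of squared MMD follows. The Gaussian kernel, the bandwidth `sigma`, and the toy feature vectors are assumptions; the abstract does not state which kernel or estimator SGGAN uses.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of MMD^2: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def mean_kernel(a_set, b_set):
        total = sum(gaussian_kernel(a, b, sigma) for a in a_set for b in b_set)
        return total / (len(a_set) * len(b_set))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Hypothetical feature vectors for real and generated samples.
real_feats = [[0.0, 0.0], [1.0, 0.0]]
fake_same = [[0.0, 0.0], [1.0, 0.0]]   # matches the real features
fake_far = [[5.0, 5.0], [6.0, 5.0]]    # far from the real features

mmd_same = mmd_squared(real_feats, fake_same)
mmd_far = mmd_squared(real_feats, fake_far)
```

The estimate is near zero when the two feature sets coincide and grows as they drift apart, which is what makes it usable as a feature matching loss for a generator.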