Automatic facial expression recognition is an important research area in emotion recognition and computer vision. Applications can be found in several domains such as medical treatment, driver fatigue surveillance, sociable robotics, and several other human-computer interaction systems. It is therefore crucial that the machine can recognize the emotional state of the user with high accuracy. In recent years, deep neural networks have been used with great success in recognizing emotions. In this paper, we present a new model for continuous emotion recognition based on facial expression recognition by using an unsupervised learning approach based on transfer learning and autoencoders. The proposed approach also includes preprocessing and post-processing techniques that improve the concordance correlation coefficient (CCC) of the predictions for the arousal and valence dimensions. Experimental results for predicting spontaneous and natural emotions on the RECOLA 2016 dataset show that the proposed approach based on visual information can achieve CCCs of 0.516 and 0.264 for valence and arousal, respectively.
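Since the results above are reported as concordance correlation coefficients, a minimal sketch of the metric may help. It assumes Lin's standard definition with population (biased) moments; the function name and the toy arousal values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def concordance_correlation_coefficient(y_true, y_pred):
    """Lin's CCC: 2*cov / (var_true + var_pred + (mean_true - mean_pred)**2)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    # Population (biased) variances and covariance.
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant series")
    return 2.0 * cov / denom


# Hypothetical per-frame arousal annotations and predictions.
gold = [0.10, 0.25, 0.40, 0.35, 0.20]
pred = [0.05, 0.20, 0.30, 0.40, 0.25]
print(concordance_correlation_coefficient(gold, pred))
```

Unlike the Pearson correlation, the CCC also penalises a shift in mean or scale between prediction and annotation, which is why it is commonly used for continuous arousal and valence traces.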
Most existing deep neural networks for automatic facial expression recognition focus on a set of predefined emotion classes, where the amount of training data has the biggest impact on performance. However, in the standard setting, over-parameterised neural networks are not amenable to learning from few samples, as they can quickly over-fit. In addition, these approaches generalise poorly to new categories, where the data for each category is very limited and expressions vary significantly within the same semantic category. We embrace these challenges and formulate the problem as a low-shot learning problem, where once the base classifier is deployed, it must rapidly adapt to recognise novel classes using a few samples. In this paper, we revisit and compare existing few-shot learning methods for low-shot facial expression recognition in terms of their generalisation ability via episode-training. In particular, we extend our analysis to cross-domain generalisation, where training and test tasks are not drawn from the same distribution. We demonstrate the efficacy of low-shot learning methods through extensive experiments.
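Episode-training is usually organised around N-way K-shot tasks. The sketch below shows one common way to sample such an episode from a labelled dataset; the function name, the class names, and the episode sizes are illustrative assumptions, and the abstract does not specify the exact sampling protocol used.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Return (support, query) lists of (sample_index, episode_label) pairs."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    # Only classes with enough samples for both support and query sets qualify.
    eligible = [c for c, idx in by_class.items() if len(idx) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")
    support, query = [], []
    # Classes are relabelled 0..n_way-1 inside each episode.
    for episode_label, cls in enumerate(rng.sample(eligible, n_way)):
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query


# Hypothetical dataset: 7 expression classes with 8 samples each.
class_names = ("anger", "fear", "joy", "sadness", "surprise", "disgust", "contempt")
labels = [name for name in class_names for _ in range(8)]
rng = random.Random(0)
support, query = sample_episode(labels, n_way=5, k_shot=1, n_query=3, rng=rng)
print(len(support), "support samples;", len(query), "query samples")
```

For a cross-domain evaluation, the same sampler would simply be applied to a test dataset different from the one used for training episodes.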
VIS-NIR face recognition remains a challenging task due to the differences between the spectral components of the two modalities and insufficient paired training data. Inspired by CycleGAN, this paper presents a method that translates VIS face images into fake NIR images whose distributions are intended to approximate those of true NIR images, which is achieved by proposing a new facial feature embedded CycleGAN. Firstly, to learn the particular features of the NIR domain while preserving the common facial representation between the VIS and NIR domains, we employ a general facial feature extractor (FFE) to replace the encoder in the original generator of CycleGAN. To implement the facial feature extractor, MobileFaceNet is pretrained on a VIS face database and is able to extract effective features. Secondly, the domain-invariant feature learning is enhanced by a new pixel consistency loss. Lastly, we establish a new WHU VIS-NIR database, which varies in face rotation and expression, to enrich the training data. Experimental results on the Oulu-CASIA NIR-VIS database and the WHU VIS-NIR database show that the proposed FFE-based CycleGAN (FFE-CycleGAN) outperforms state-of-the-art VIS-NIR face recognition methods and achieves 96.5% accuracy.
In this paper, we present a novel Gabor wavelet based Kernel Entropy Component Analysis (KECA) method that integrates the Gabor wavelet transformation (GWT) of facial images with KECA for enhanced face recognition performance. First, the most discriminative facial features, characterized by spatial frequency, spatial locality and orientation selectivity, are derived from the Gabor wavelet transformed images to cope with variations due to illumination and facial expression changes. Next, KECA, which is related to the Renyi entropy, is extended to include the cosine kernel function. KECA with the cosine kernel is then applied to the extracted discriminative feature vectors of the facial images to obtain only those real kernel ECA eigenvectors that are associated with eigenvalues having a positive entropy contribution. Finally, these real KECA features are used for image classification with the L1 and L2 distance measures, the Mahalanobis distance measure and the cosine similarity measure. The feasibility of the Gabor based KECA method with the cosine kernel has been successfully tested on both frontal and pose-angled face recognition, using datasets from the ORL, FRAV2D and FERET databases.
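The KECA step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the usual KECA formulation, in which the Renyi entropy estimate is proportional to the sum of the terms lambda_i * (e_i^T 1)^2 over the eigenpairs of the uncentred kernel matrix, and it uses random vectors as stand-ins for Gabor features. All function and variable names are hypothetical.

```python
import numpy as np


def cosine_kernel(A, B):
    """Cosine similarity between every row of A and every row of B (rows must be non-zero)."""
    A_unit = A / np.linalg.norm(A, axis=1, keepdims=True)
    B_unit = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A_unit @ B_unit.T


def fit_keca(X, n_components):
    """Keep the eigenpairs of the kernel matrix with the largest entropy terms."""
    K = cosine_kernel(X, X)  # KECA uses the uncentred kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K)
    # Entropy term of each eigenpair: lambda_i * (e_i^T 1)^2.
    entropy_terms = eigvals * eigvecs.sum(axis=0) ** 2
    # Only eigenpairs with a positive eigenvalue and a positive entropy term qualify.
    valid = np.flatnonzero((eigvals > 1e-10 * eigvals.max()) & (entropy_terms > 0))
    chosen = valid[np.argsort(entropy_terms[valid])[::-1][:n_components]]
    return {
        "train": X,
        "eigvals": eigvals[chosen],
        "eigvecs": eigvecs[:, chosen],
        "entropy_terms": entropy_terms[chosen],
    }


def transform(model, X_new):
    """Project samples onto the selected kernel ECA axes."""
    K_new = cosine_kernel(X_new, model["train"])
    return K_new @ model["eigvecs"] / np.sqrt(model["eigvals"])


# Random stand-ins for Gabor feature vectors (20 images, 6 features each).
X = np.random.default_rng(0).normal(size=(20, 6))
model = fit_keca(X, n_components=3)
projected = transform(model, X)
print(projected.shape)
```

Note that, unlike kernel PCA, the components are ranked by their entropy terms rather than by eigenvalue alone, so the selected axes need not correspond to the largest eigenvalues.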
Engagement is a key indicator of the quality of learning experience, and one that plays a major role in developing intelligent educational interfaces. Any such interface requires the ability to recognise the level of engagement in order to respond appropriately; however, there is very little existing data to learn from, and new data is expensive and difficult to acquire. This paper presents a deep learning model to improve engagement recognition from face images captured 'in the wild' that overcomes the data sparsity challenge by pre-training on readily available basic facial expression data, before training on specialised engagement data. In the first of two steps, a state-of-the-art facial expression recognition model is trained to provide a rich face representation using deep learning. In the second step, we use the model's weights to initialize our deep learning based model to recognize engagement; we term this the Transfer model. We train the model on our new engagement recognition (ER) dataset with 4627 engaged and disengaged samples. We find that our Transfer architecture outperforms standard deep learning architectures that we apply for the first time to engagement recognition, as well as approaches using HOG features and SVMs. The model achieves a classification accuracy of 72.38%, which is 6.1% better than the best baseline model on the test set of the ER dataset. Using the F1 measure and the area under the ROC curve, our Transfer model achieves 73.90% and 73.74%, exceeding the best baseline model by 3.49% and 5.33% respectively.
Current research on soft-biometrics has shown that privacy-sensitive information can be deduced from the biometric templates of an individual. Since these templates are expected to be used for recognition purposes only in many applications, this raises major privacy issues. Previous works focused on supervised privacy-enhancing solutions that require privacy-sensitive information about individuals and limit their application to the suppression of single and pre-defined attributes. Consequently, they do not take into account attributes that are not considered in the training. In this work, we present Negative Face Recognition (NFR), a novel face recognition approach that enhances the soft-biometric privacy on the template-level by representing face templates in a complementary (negative) domain. While ordinary templates characterize facial properties of an individual, negative templates describe facial properties that do not exist for this individual. This suppresses privacy-sensitive information in stored templates. Experiments are conducted on two publicly available datasets captured under controlled and uncontrolled scenarios on three privacy-sensitive attributes. The experiments demonstrate that our proposed approach reaches higher suppression rates than previous work, while also maintaining higher recognition performance. Unlike previous works, our approach does not require privacy-sensitive labels and offers a more comprehensive privacy-protection not limited to pre-defined attributes.
Faces are highly deformable objects which may easily change their appearance over time. Not all face areas are subject to the same variability. Therefore, decoupling the information from independent areas of the face is of paramount importance to improve the robustness of any face recognition technique. This paper presents a robust face recognition technique based on the extraction and matching of SIFT features related to independent face areas. Both a global and a local (recognition from parts) matching strategy are proposed. The local strategy is based on matching individual salient facial SIFT features connected to facial landmarks such as the eyes and the mouth. In the global matching strategy, all SIFT features are combined to form a single feature. In order to reduce identification errors, the Dempster-Shafer decision theory is applied to fuse the two matching techniques. The proposed algorithms are evaluated with the ORL and the IITK face databases. The experimental results demonstrate the effectiveness and potential of the proposed face recognition techniques, also in the case of partially occluded faces or faces with missing information.
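The fusion step can be illustrated with Dempster's rule of combination. The sketch below is a generic implementation of that rule, not the authors' code; the two-hypothesis frame and the mass values assigned to the global and local matchers are hypothetical, since the abstract does not state how matching scores are converted into belief masses.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset of hypotheses -> mass)."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalise by the non-conflicting mass.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


GENUINE = frozenset({"genuine"})
IMPOSTOR = frozenset({"impostor"})
UNKNOWN = GENUINE | IMPOSTOR  # the whole frame, i.e. ignorance

# Hypothetical masses derived from the global and local SIFT matchers.
global_match = {GENUINE: 0.6, IMPOSTOR: 0.1, UNKNOWN: 0.3}
local_match = {GENUINE: 0.7, IMPOSTOR: 0.2, UNKNOWN: 0.1}

fused = dempster_combine(global_match, local_match)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

When both matchers lean towards the same hypothesis, the fused mass for that hypothesis exceeds either individual mass, which is the behaviour that makes the rule attractive for reducing identification errors.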
Human face recognition is one of the most important research areas in biometrics. However, robust face recognition under drastic changes in facial pose, expression, and illumination remains a major challenge for practical applications. In this paper, we propose a novel face recognition method, called Attentional Feature-pair Relation Network (AFRN), which represents the face by the relevant pairs of local appearance block features with their attention scores. The AFRN represents the face by all possible pairs of the 9x9 local appearance block features; the importance of each pair is given by an attention map obtained from low-rank bilinear pooling, and each pair is weighted by its corresponding attention score. To increase the accuracy, we select the top-K pairs of local appearance block features as relevant facial information and drop the remaining irrelevant pairs. The weighted top-K pairs are propagated to extract the joint feature-pair relation by using a bilinear attention network. In experiments, we show the effectiveness of the proposed AFRN and achieve outstanding performance in the 1:1 face verification and 1:N face identification tasks compared to existing state-of-the-art methods on the challenging LFW, YTF, CALFW, CPLFW, CFP, AgeDB, IJB-A, IJB-B, and IJB-C datasets.
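The pair scoring and top-K selection can be sketched in a few lines of NumPy. This is only an illustration under several assumptions that the abstract does not confirm: pairs are taken as unordered, the attention logit uses a common low-rank bilinear form p^T (tanh(U^T x) * tanh(V^T y)), scores are normalised with a softmax, and all dimensions and weights are random placeholders instead of learned parameters.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)

# 9x9 grid of local appearance blocks; the other sizes are hypothetical.
num_blocks, feat_dim, rank, top_k = 81, 16, 8, 10
blocks = rng.normal(size=(num_blocks, feat_dim))

# Random stand-ins for the learned low-rank projection matrices.
U = rng.normal(size=(feat_dim, rank))
V = rng.normal(size=(feat_dim, rank))
p = rng.normal(size=rank)

# All unordered pairs of blocks: 81 * 80 / 2 = 3240 pairs.
pairs = np.array(list(combinations(range(num_blocks), 2)))

# Low-rank bilinear attention logit for each pair.
left = np.tanh(blocks[pairs[:, 0]] @ U)
right = np.tanh(blocks[pairs[:, 1]] @ V)
logits = (left * right) @ p

# Softmax over all pairs gives the attention scores.
exp_logits = np.exp(logits - logits.max())
attention = exp_logits / exp_logits.sum()

# Keep the K highest-scoring pairs and weight their concatenated features.
top = np.argsort(attention)[::-1][:top_k]
pair_features = np.concatenate(
    [blocks[pairs[top, 0]], blocks[pairs[top, 1]]], axis=1
)
weighted_pairs = attention[top, None] * pair_features
print(weighted_pairs.shape)
```

In a trained model, the weighted pairs would then be passed to the relation network described above; that stage is omitted here.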
Despite their continued popularity, categorical approaches to affect recognition have limitations, especially in real-life situations. Dimensional models of affect offer important advantages for the recognition of subtle expressions and more fine-grained analysis. We introduce a simple but effective facial expression analysis (FEA) system for dimensional affect, solely based on geometric features and Partial Least Squares (PLS) regression. The system jointly learns to estimate Arousal and Valence ratings from a set of facial images. The proposed approach is robust, efficient, and exhibits comparable performance to contemporary deep learning models, while requiring a fraction of the computational resources.
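To make the joint estimation concrete, the sketch below fits a two-output PLS regression (PLS2) with the classical NIPALS iteration. It is a generic textbook formulation, not the authors' system: the geometric features are replaced by random numbers, the number of components is arbitrary, and the function names are hypothetical.

```python
import numpy as np


def fit_pls2(X, Y, n_components, n_iter=500, tol=1e-12):
    """Fit PLS2 with NIPALS; returns (coefficients, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean  # centred working copies
    W, P, Q = [], [], []
    for _ in range(n_components):
        u = Yr[:, 0].copy()
        for _ in range(n_iter):
            w = Xr.T @ u
            w /= np.linalg.norm(w)       # X weights
            t = Xr @ w                   # X scores
            q = Yr.T @ t / (t @ t)       # Y loadings
            u_new = Yr @ q / (q @ q)     # Y scores
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = Xr.T @ t / (t @ t)           # X loadings
        # Deflate both blocks before extracting the next component.
        Xr = Xr - np.outer(t, p)
        Yr = Yr - np.outer(t, q)
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    coef = W @ np.linalg.solve(P.T @ W, Q.T)
    return coef, x_mean, y_mean


def predict(model, X_new):
    coef, x_mean, y_mean = model
    return (X_new - x_mean) @ coef + y_mean


# Random stand-ins: 60 faces, 5 geometric features, 2 targets (arousal, valence).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
Y = X @ rng.normal(size=(5, 2)) + 0.05 * rng.normal(size=(60, 2))

model = fit_pls2(X, Y, n_components=3)
predictions = predict(model, X)
print(predictions.shape)
```

Because both targets share the same latent components, Arousal and Valence are estimated jointly from one small linear model, which is consistent with the low computational cost claimed above.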
Face recognition is a popular and well-studied area with wide applications in our society. However, racial bias has been shown to be inherent in most state-of-the-art (SOTA) face recognition systems. Many investigative studies on face recognition algorithms have reported higher false positive rates for African subject cohorts than for other cohorts. The lack of large-scale African face image databases in the public domain is one of the main restrictions in studying the racial bias problem of face recognition. To this end, we collect a face image database, named CASIA-Face-Africa, which contains 38,546 images of 1,183 African subjects. Multi-spectral cameras are utilized to capture the face images under various illumination settings. Demographic attributes and facial expressions of the subjects are also carefully recorded. For landmark detection, each face image in the database is manually labeled with 68 facial keypoints. A group of evaluation protocols is constructed according to different applications, tasks, partitions and scenarios. The performances of SOTA face recognition algorithms without re-training are reported as baselines. The proposed database, along with its face landmark annotations, evaluation protocols and preliminary results, forms a good benchmark to study the essential aspects of face biometrics for African subjects, especially face image preprocessing, face feature analysis and matching, facial expression recognition, sex/age estimation, ethnic classification, face image generation, etc. The database can be downloaded from http://www.cripacsir.cn/dataset/