"facial": models, code, and papers

Identity-Enhanced Network for Facial Expression Recognition

Dec 11, 2018
Yanwei Li, Xingang Wang, Shilei Zhang, Lingxi Xie, Wenqi Wu, Hongyuan Yu, Zheng Zhu

Facial expression recognition is a challenging task, arguably because of large intra-class variations and high inter-class similarities. The core drawback of existing approaches is their inability to distinguish changes in appearance caused by emotions from those caused by identity. In this paper, we present a novel identity-enhanced network (IDEnNet) to eliminate the negative impact of the identity factor and focus on recognizing facial expressions. Spatial fusion combined with self-constrained multi-task learning is adopted to jointly learn the expression representations and identity-related information. We evaluate our approach on three popular datasets, namely Oulu-CASIA, CK+ and MMI. IDEnNet improves on the baseline consistently, and achieves the best or comparable state-of-the-art results on all three datasets.

Facial Action Unit Recognition With Multi-models Ensembling

Mar 24, 2022
Wenqiang Jiang, Yannan Wu, Fengsheng Qiao, Liyu Meng, Yuanyuan Deng, Chuanhe Liu

The Affective Behavior Analysis in-the-wild (ABAW) 2022 Competition gives a large boost to Affective Computing. In this paper, we present our method for the AU challenge of this competition. We use an improved IResNet100 as the backbone. We then train on the AU dataset in Aff-Wild2 starting from three models pretrained on our private AU and expression datasets and on Glint360K, respectively. Finally, we ensemble the results of our models. We achieve a macro F1 score of 0.731 on the AU validation set.
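The ensembling and macro F1 steps mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the model outputs are combined, so probability averaging, the 0.5 threshold, the function names, and the toy numbers below are all assumptions made for illustration, not the authors' pipeline.

```python
from typing import List

def ensemble_average(prob_sets: List[List[List[float]]]) -> List[List[float]]:
    """Average per-AU probabilities across models.

    prob_sets[m][i][k] is model m's probability that AU k is active in sample i.
    Averaging is an assumed ensembling rule, not one stated in the abstract.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_aus = len(prob_sets[0][0])
    return [
        [sum(prob_sets[m][i][k] for m in range(n_models)) / n_models
         for k in range(n_aus)]
        for i in range(n_samples)
    ]

def binary_f1(y_true: List[int], y_pred: List[int]) -> float:
    """F1 score for one AU treated as a binary label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(labels: List[List[int]], probs: List[List[float]],
             threshold: float = 0.5) -> float:
    """Unweighted mean of the per-AU F1 scores."""
    n_aus = len(labels[0])
    scores = []
    for k in range(n_aus):
        y_true = [row[k] for row in labels]
        y_pred = [1 if row[k] >= threshold else 0 for row in probs]
        scores.append(binary_f1(y_true, y_pred))
    return sum(scores) / n_aus

# Toy data: 4 samples, 2 AUs, 3 hypothetical models.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
model_a = [[0.9, 0.2], [0.4, 0.7], [0.6, 0.4], [0.1, 0.3]]
model_b = [[0.8, 0.1], [0.6, 0.8], [0.4, 0.7], [0.2, 0.6]]
model_c = [[0.7, 0.3], [0.2, 0.9], [0.7, 0.6], [0.3, 0.2]]

ensemble = ensemble_average([model_a, model_b, model_c])
print("model_b macro F1:", macro_f1(labels, model_b))
print("ensemble macro F1:", macro_f1(labels, ensemble))
```

On this toy data each model makes different mistakes, so the averaged scores classify every sample correctly while a single model does not.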

Realistic Speech-Driven Facial Animation with GANs

Jun 14, 2019
Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Speech-driven facial animation is the process that automatically synthesizes talking characters based on speech signals. The majority of work in this domain creates a mapping from audio features to visual features. This approach often requires post-processing using computer graphics techniques to produce realistic albeit subject dependent results. We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features. Our method generates videos which have (a) lip movements that are in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. Our temporal GAN uses 3 discriminators focused on achieving detailed frames, audio-visual synchronization, and realistic expressions. We quantify the contribution of each component in our model using an ablation study and we provide insights into the latent representation of the model. The generated videos are evaluated based on sharpness, reconstruction quality, lip-reading accuracy, synchronization as well as their ability to generate natural blinks.

* arXiv admin note: text overlap with arXiv:1805.09313

Automated Pain Detection from Facial Expressions using FACS: A Review

Nov 13, 2018
Zhanli Chen, Rashid Ansari, Diana Wilkie

Facial pain expression is an important modality for assessing pain, especially when the patient's verbal ability to communicate is impaired. The facial muscle-based action units (AUs), which are defined by the Facial Action Coding System (FACS), have been widely studied and are highly reliable as a method for detecting facial expressions (FE) including valid detection of pain. Unfortunately, FACS coding by humans is a very time-consuming task that makes its clinical use prohibitive. Significant progress on automated facial expression recognition (AFER) has led to its numerous successful applications in FACS-based affective computing problems. However, only a handful of studies have been reported on automated pain detection (APD), and its application in clinical settings is still far from a reality. In this paper, we review the progress in research that has contributed to automated pain detection, with focus on 1) the framework-level similarity between spontaneous AFER and APD problems; 2) the evolution of system design including the recent development of deep learning methods; 3) the strategies and considerations in developing a FACS-based pain detection framework from existing research; and 4) introduction of the most relevant databases that are available for AFER and APD studies. We attempt to present key considerations in extending a general AFER framework to an APD framework in clinical settings. In addition, the performance metrics are also highlighted in evaluating an AFER or an APD system.

MyStyle: A Personalized Generative Prior

Mar 31, 2022
Yotam Nitzan, Kfir Aberman, Qiurui He, Orly Liba, Michal Yarom, Yossi Gandelsman, Inbar Mosseri, Yael Pritch, Daniel Cohen-Or

We introduce MyStyle, a personalized deep generative prior trained with a few shots of an individual. MyStyle makes it possible to reconstruct, enhance and edit images of a specific person, such that the output is faithful to the person's key facial characteristics. Given a small reference set of portrait images of a person (~100), we tune the weights of a pretrained StyleGAN face generator to form a local, low-dimensional, personalized manifold in the latent space. We show that this manifold constitutes a personalized region that spans latent codes associated with diverse portrait images of the individual. Moreover, we demonstrate that we obtain a personalized generative prior, and propose a unified approach to apply it to various ill-posed image enhancement problems, such as inpainting and super-resolution, as well as semantic editing. Using the personalized generative prior we obtain outputs that exhibit high fidelity to the input images and are also faithful to the key facial characteristics of the individual in the reference set. We demonstrate our method with fair-use images of numerous widely recognizable individuals for whom we have the prior knowledge for a qualitative evaluation of the expected outcome. We evaluate our approach against few-shot baselines and show that our personalized prior, quantitatively and qualitatively, outperforms state-of-the-art alternatives.

* Project webpage: https://mystyle-personalized-prior.github.io/, Video: https://youtu.be/QvOdQR3tlOc

Multimodal Engagement Analysis from Facial Videos in the Classroom

Jan 11, 2021
Ömer Sümer, Patricia Goldberg, Sidney D'Mello, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci

Student engagement is a key construct for learning and teaching. While most of the literature has explored student engagement analysis in computer-based settings, this paper extends that focus to classroom instruction. To best examine student visual engagement in the classroom, we conducted a study utilizing the audiovisual recordings of classes at a secondary school over one and a half months, acquired continuous engagement labeling per student (N=15) in repeated sessions, and explored computer vision methods to classify engagement levels from faces in the classroom. We trained deep embeddings for attentional and emotional features, training Attention-Net for head pose estimation and Affect-Net for facial expression recognition. We additionally trained different engagement classifiers, consisting of Support Vector Machines, Random Forest, Multilayer Perceptron, and Long Short-Term Memory, for both features. The best performing engagement classifiers achieved AUCs of .620 and .720 in Grades 8 and 12, respectively. We further investigated fusion strategies and found that score-level fusion either improves the engagement classifiers or is on par with the best performing modality. We also investigated the effect of personalization and found that using only 60 seconds of person-specific data selected by margin uncertainty of the base classifier yielded an average AUC improvement of .084.
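Score-level fusion and the AUC metric reported above can be sketched as follows. The abstract does not specify the fusion rule, so the weighted average used here is an assumption, and the scores, labels, and function names are hypothetical toy values rather than data from the study.

```python
from typing import List

def auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive is scored above a random
    negative, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def fuse_scores(a: List[float], b: List[float], weight: float = 0.5) -> List[float]:
    """Score-level fusion as a weighted average of two classifiers' scores.

    The weighted average is an assumed rule chosen for illustration.
    """
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]

# Toy data: 1 = engaged, 0 = not engaged (hypothetical values).
labels = [1, 1, 1, 0, 0, 0]
attention_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.3]  # e.g. head-pose based classifier
affect_scores = [0.6, 0.8, 0.3, 0.4, 0.7, 0.1]     # e.g. expression based classifier

fused = fuse_scores(attention_scores, affect_scores)
print("attention AUC:", auc(labels, attention_scores))
print("affect AUC:   ", auc(labels, affect_scores))
print("fused AUC:    ", auc(labels, fused))
```

In this toy case the two modalities err on different samples, so the fused scores rank every engaged sample above every non-engaged one. Real data will not generally separate this cleanly.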

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Region Based Extensive Response Index Pattern for Facial Expression Recognition

Nov 26, 2018
Monu Verma, Santosh K. Vipparthi, Girdhari Singh

This paper presents a novel descriptor named Region based Extensive Response Index Pattern (RETRaIN) for facial expression recognition. RETRaIN encodes the relation between the reference and neighboring pixels of facial active regions. These relations are computed by applying directional compass masks to an input image and extracting the high edge responses in the foremost directions. Further, extreme edge index positions are selected and encoded into a six-bit compact code to reduce feature dimensionality and distinguish between the uniform and non-uniform patterns in the facial features. The performance of the proposed descriptor is tested and evaluated on three benchmark datasets: Extended Cohn Kanade, JAFFE, and MUG. RETRaIN achieves superior recognition accuracy in comparison to state-of-the-art techniques.
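To make the idea of compass-mask edge responses concrete, here is a minimal sketch that applies the eight Kirsch compass masks to a single 3x3 patch. The abstract does not name the masks or define the six-bit code, so the use of Kirsch masks and the packing shown here (3 bits for the index of the strongest response, 3 bits for the index of the weakest) are assumptions for illustration only; the actual RETRaIN encoding may differ.

```python
from typing import List

# Border positions of a 3x3 patch, clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# Kirsch weights for the "north" mask laid out along RING; the centre weight is 0.
BASE = [5, 5, 5, -3, -3, -3, -3, -3]

def kirsch_responses(patch: List[List[float]]) -> List[float]:
    """Return the 8 directional edge responses (N, NE, E, SE, S, SW, W, NW)."""
    responses = []
    for k in range(8):
        # Rotating the weights k steps around the ring gives the k-th compass mask.
        weights = BASE[-k:] + BASE[:-k]
        responses.append(sum(w * patch[r][c] for w, (r, c) in zip(weights, RING)))
    return responses

def six_bit_code(responses: List[float]) -> int:
    """Hypothetical packing: index of max response (3 bits), then index of min (3 bits)."""
    i_max = responses.index(max(responses))
    i_min = responses.index(min(responses))
    return (i_max << 3) | i_min

# A patch with a bright top row, i.e. a horizontal edge facing north.
patch = [[10, 10, 10],
         [0, 0, 0],
         [0, 0, 0]]

responses = kirsch_responses(patch)
code = six_bit_code(responses)
print(responses)
print(format(code, "06b"))
```

The strongest response comes from the north mask, as expected for this patch. Ties (here among the three weakest directions) are resolved by taking the first index, which is another arbitrary choice of this sketch.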

* Conference

Fast and Effective Adaptation of Facial Action Unit Detection Deep Model

Sep 26, 2019
Mihee Lee, Ognjen Rudovic, Vladimir Pavlovic, Maja Pantic

Detecting facial action units (AUs) is one of the fundamental steps in automatic recognition of facial expressions of emotions and cognitive states. Though a variety of approaches have been proposed for this task, most of these models are trained only for the specific target AUs, and as such they fail to easily adapt to the task of recognizing new AUs (i.e., those not initially used to train the target models). In this paper, we propose a deep learning approach for facial AU detection that can quickly and easily adapt to a new AU or target subject by leveraging only a few labeled samples from the new task (either an AU or subject). To this end, we propose a modeling approach based on the notion of model-agnostic meta-learning [C. Finn and Levine, 2017], originally proposed for general image recognition/detection tasks (e.g., character recognition on the Omniglot dataset). Specifically, each subject and/or AU is treated as a new learning task and the model learns to adapt based on the knowledge of the previous tasks (the AUs and subjects used to pre-train the target models). Thus, given a new subject or AU, this meta-knowledge (that is shared among training and test tasks) is used to adapt the model to the new task using the notion of deep learning and model-agnostic meta-learning. We show on two benchmark datasets (BP4D and DISFA) for facial AU detection that the proposed approach can be easily adapted to new tasks (AUs/subjects). Using only a few labeled examples from these tasks, the model achieves large improvements over the baselines (i.e., non-adapted models).
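The model-agnostic meta-learning (MAML) update that this abstract builds on can be sketched on a deliberately tiny problem. The example below is not the authors' AU model: it assumes a one-parameter linear regressor `y = w * x`, where each "task" is a different slope (standing in for a new AU or subject), so the inner adaptation step and the outer meta-gradient can be written out exactly. All names, step sizes, and data are hypothetical.

```python
from typing import List, Tuple

Task = Tuple[List[float], List[float], List[float], List[float]]

ALPHA = 0.1  # inner-loop (adaptation) step size, assumed
BETA = 0.1   # outer-loop (meta) step size, assumed

def loss(w: float, xs: List[float], ys: List[float]) -> float:
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w: float, xs: List[float], ys: List[float]) -> float:
    """Derivative of the loss with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

def hessian(xs: List[float]) -> float:
    """Second derivative of the loss with respect to w (independent of w here)."""
    return sum(2 * x * x for x in xs) / len(xs)

def adapt(w: float, xs: List[float], ys: List[float], alpha: float) -> float:
    """Inner loop: one gradient step on a task's few labeled support samples."""
    return w - alpha * grad(w, xs, ys)

def maml_step(w: float, tasks: List[Task], alpha: float, beta: float) -> float:
    """Outer loop: update w so that the adapted weights do well on query data."""
    meta_grad = 0.0
    for sx, sy, qx, qy in tasks:
        w_adapted = adapt(w, sx, sy, alpha)
        # Chain rule through the inner step: d(w_adapted)/dw = 1 - alpha * hessian.
        meta_grad += grad(w_adapted, qx, qy) * (1.0 - alpha * hessian(sx))
    return w - beta * meta_grad / len(tasks)

def make_task(slope: float) -> Task:
    """Build support and query sets for a task whose true slope is `slope`."""
    sx, qx = [1.0, 2.0], [1.5, 2.5]
    return sx, [slope * x for x in sx], qx, [slope * x for x in qx]

# Meta-train on three tasks, then adapt to an unseen one with two samples.
train_tasks = [make_task(s) for s in (1.0, 2.0, 3.0)]
w_meta = 0.0
for _ in range(200):
    w_meta = maml_step(w_meta, train_tasks, ALPHA, BETA)

new_sx, new_sy, new_qx, new_qy = make_task(2.5)
loss_before = loss(w_meta, new_qx, new_qy)
w_new = adapt(w_meta, new_sx, new_sy, ALPHA)
loss_after = loss(w_new, new_qx, new_qy)
print(w_meta, loss_before, loss_after)
```

In this toy setting the meta-learned weight settles at the mean of the training slopes, and a single adaptation step on two support samples cuts the query loss on the unseen task. A deep AU detector would replace the scalar `w` with network weights and compute these gradients by automatic differentiation.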

* Presented at 2019 IJCAI Affective Computing Workshop

Identity-Free Facial Expression Recognition using conditional Generative Adversarial Network

Mar 19, 2019
Jie Cai, Zibo Meng, Ahmed Shehab Khan, Zhiyuan Li, James O'Reilly, Yan Tong

In this paper, we proposed a novel Identity-free conditional Generative Adversarial Network (IF-GAN) to explicitly reduce inter-subject variations for facial expression recognition. Specifically, for any given input face image, a conditional generative model was developed to transform an average neutral face, which is calculated from various subjects showing neutral expressions, to an average expressive face with the same expression as the input image. Since the transformed images have the same synthetic "average" identity, they differ from each other by only their expressions and thus, can be used for identity-free expression classification. In this work, an end-to-end system was developed to perform expression transformation and expression recognition in the IF-GAN framework. Experimental results on three facial expression datasets have demonstrated that the proposed IF-GAN outperforms the baseline CNN model and achieves comparable or better performance compared with the state-of-the-art methods for facial expression recognition.

Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking

May 06, 2019
Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, King Ngi Ngan

In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and face model adaptation on-the-fly, in unconstrained scenarios with heavy occlusions and arbitrary facial expression variations. Specifically, we introduce a statistical 3D morphable model that flexibly describes the distribution of points on the surface of the face model, with an efficient switchable online adaptation that gradually captures the identity of the tracked subject and rapidly constructs a suitable face model when the subject changes. Moreover, unlike prior art that employed ICP-based facial pose estimation, to improve robustness to occlusions, we propose a ray visibility constraint that regularizes the pose based on the face model's visibility with respect to the input point cloud. Ablation studies and experimental results on the Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective and outperforms competing state-of-the-art depth-based methods.
