Face synthesis is an important problem in computer vision with many applications. In this work, we describe a new method, LandmarkGAN, to synthesize faces from facial landmarks. Facial landmarks are a natural, intuitive, and effective representation of facial expressions and orientations, and they are independent of the target's texture, color, and background scene. Our method can transform a set of facial landmarks into new faces of different subjects while retaining the same facial expression and orientation. Experimental results on face synthesis and reenactment demonstrate the effectiveness of our method.
Sophisticated generative adversarial network (GAN) models are now able to synthesize highly realistic human faces that are difficult to distinguish visually from real ones. In this work, we show that GAN-synthesized faces can be exposed through inconsistent corneal specular highlights between the two eyes. The inconsistency is caused by the lack of physical/physiological constraints in the GAN models. We show that such artifacts exist widely in high-quality GAN-synthesized faces, and we further describe an automatic method to extract and compare the corneal specular highlights of the two eyes. Qualitative and quantitative evaluations of our method demonstrate its simplicity and effectiveness in distinguishing GAN-synthesized faces.
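The abstract does not specify how the extracted highlights are compared; a natural choice is an overlap score between the two highlight regions after aligning both eyes into a common coordinate frame. The following is a minimal sketch of that idea, assuming the highlight regions have already been extracted as aligned binary masks; the function name, mask sizes, and decision threshold are illustrative, not the paper's.

```python
import numpy as np

def highlight_iou(mask_a, mask_b):
    """Intersection-over-Union of two binary specular-highlight masks.

    Both masks are assumed to be boolean arrays of the same shape,
    already aligned to a common normalized eye coordinate frame.
    """
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union > 0 else 1.0  # two empty masks: consistent

# Toy masks: a real face tends to have near-identical highlights in both
# eyes (same light sources), while a GAN face often does not.
real_l = np.zeros((32, 32), bool); real_l[10:14, 10:14] = True
real_r = np.zeros((32, 32), bool); real_r[10:14, 11:15] = True   # small shift
fake_l = np.zeros((32, 32), bool); fake_l[10:14, 10:14] = True
fake_r = np.zeros((32, 32), bool); fake_r[20:26, 5:8] = True     # unrelated spot

THRESHOLD = 0.3  # illustrative: below this, flag the face as suspicious
print(highlight_iou(real_l, real_r))  # 0.6 -> consistent
print(highlight_iou(fake_l, fake_r))  # 0.0 -> flagged
```

In practice the masks would come from thresholding the bright pixels inside detected iris regions; the IoU score then serves as the consistency measure between the two eyes.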
In this paper, we describe a fast and light-weight portrait segmentation method based on a new {\em extremely light-weight backbone} (ELB) architecture. The core element of the ELB is a {\em bottleneck-based factorized block} (BFB) that has far fewer parameters than existing alternatives while retaining good learning capacity. Consequently, the ELB-based portrait segmentation method runs faster (263.2 FPS) than existing methods while achieving accuracy competitive with the state of the art. Experiments conducted on two benchmark datasets demonstrate the effectiveness and efficiency of our method.
Improving the efficiency of portrait segmentation is of great importance for deployment on mobile devices. In this paper, we achieve fast and light-weight portrait segmentation by introducing a new extremely light-weight backbone (ELB) architecture. The core element of the ELB is a bottleneck-based factorized block (BFB), which greatly reduces the number of parameters while retaining good learning capacity. On top of the proposed ELB architecture, we use only a single convolution layer as the decoder to generate results. The ELB-based portrait segmentation method runs faster (263.2 FPS) than existing methods while achieving accuracy competitive with the state of the art. Experiments on two benchmark datasets demonstrate the efficacy of our method.
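The abstracts do not spell out the BFB's internal structure, but the general mechanism behind "bottleneck plus factorized convolution" designs is easy to illustrate with a parameter count: project the channels down with a 1x1 convolution, replace the k x k convolution with a k x 1 and a 1 x k pair, then expand back. The sketch below is a hypothetical block in that style, used only to show the order-of-magnitude parameter savings; the actual BFB design may differ.

```python
def conv_params(c_in, c_out, kh, kw):
    """Weight count of a convolution layer (biases ignored for simplicity)."""
    return c_in * c_out * kh * kw

def standard_block(c, k=3):
    """Plain k x k convolution keeping c channels."""
    return conv_params(c, c, k, k)

def bottleneck_factorized_block(c, r=4, k=3):
    """Hypothetical BFB-style block: 1x1 reduce by factor r, factorized
    k x 1 and 1 x k convolutions in the bottleneck, then 1x1 expand.
    Illustrative only -- the paper's exact BFB may be structured differently."""
    b = c // r
    return (conv_params(c, b, 1, 1)      # 1x1 reduce: c -> c/r
            + conv_params(b, b, k, 1)    # k x 1 in the bottleneck
            + conv_params(b, b, 1, k)    # 1 x k in the bottleneck
            + conv_params(b, c, 1, 1))   # 1x1 expand: c/r -> c

c = 128
print(standard_block(c))               # 147456 parameters
print(bottleneck_factorized_block(c))  # 14336 parameters, ~10x fewer
```

At 128 channels the hypothetical block uses roughly a tenth of the parameters of a plain 3x3 convolution, which is the kind of saving that makes very high frame rates feasible.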
AI-synthesized face-swapping videos, commonly known as DeepFakes, have recently become an emerging problem, and there is correspondingly increasing interest in developing algorithms to detect them. However, existing datasets of DeepFake videos suffer from low visual quality and abundant artifacts that do not reflect the reality of DeepFake videos circulated on the Internet. In this work, we present a new DeepFake dataset, Celeb-DF, for the development and evaluation of DeepFake detection algorithms. The Celeb-DF dataset is generated using a refined synthesis algorithm that reduces the visual artifacts observed in existing datasets. Based on the Celeb-DF dataset, we also benchmark existing DeepFake detection algorithms.
Recent years have seen rapid development in synthesizing realistic human faces using AI technologies. Such fake faces can be weaponized to cause negative personal and social impact. In this work, we develop technologies to defend individuals from becoming victims of AI-synthesized fake videos by sabotaging would-be training data. This is achieved by disrupting deep neural network (DNN) based face detection methods with specially designed, imperceptible adversarial perturbations that reduce the quality of the detected faces. We describe attack schemes under white-box, gray-box, and black-box settings, each assuming progressively less information about the DNN-based face detectors. We empirically show the effectiveness of our methods in disrupting state-of-the-art DNN-based face detectors on several datasets.
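To make the white-box setting concrete: with full access to the detector, one can compute the gradient of its detection score with respect to the input and step against it under a small perturbation budget (the classic sign-gradient, FGSM-style attack). The sketch below substitutes a toy linear scoring function for the DNN detector so the gradient is available in closed form; it illustrates only the perturbation mechanism, not the paper's actual attack or detector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a DNN face detector: a linear scoring function.
# score(x) = w . x, where a higher score means "face detected".
w = rng.normal(size=64)          # fixed "detector" weights (white-box: known)
x = rng.normal(size=64)          # flattened "image" features

def detector_score(v):
    return float(w @ v)

# White-box sign-gradient perturbation: for this linear model the
# gradient d(score)/dx is exactly w, so stepping against sign(w)
# lowers the detection score while staying inside an L_inf budget.
eps = 0.1                        # imperceptibility budget (L_inf norm)
x_adv = x - eps * np.sign(w)

print(detector_score(x))         # original detection score
print(detector_score(x_adv))     # strictly lower after the perturbation
```

Against a real DNN detector the gradient would come from backpropagation instead of a closed form, and the gray-box and black-box settings would replace the exact gradient with surrogate-model or query-based estimates.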
Generative adversarial networks (GANs) have recently led to highly realistic image synthesis results. In this work, we describe a new method to expose GAN-synthesized images using the locations of facial landmark points. Our method is based on the observation that the configurations of facial parts generated by GAN models differ from those of real faces, due to the lack of global constraints. We perform experiments demonstrating this phenomenon and show that an SVM classifier trained on the locations of facial landmark points is sufficient to achieve good classification performance on GAN-synthesized faces.
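The classification step described above is simple enough to sketch end to end: flatten the landmark coordinates into a feature vector and fit a linear SVM. The sketch below implements the SVM directly in NumPy (hinge loss with subgradient descent, a plain stand-in for a library SVM solver) and trains it on synthetic clusters standing in for real-face and GAN-face landmark vectors; the feature dimension assumes a 68-point landmark layout, and all data here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for flattened 68-point landmark vectors (x, y -> 136-d).
# Real and GAN landmark configurations are simulated as two shifted clusters.
n, d = 200, 136
real = rng.normal(0.0, 1.0, size=(n, d))
fake = rng.normal(0.6, 1.0, size=(n, d))
X = np.vstack([real, fake])
y = np.hstack([np.full(n, -1.0), np.full(n, 1.0)])   # -1 real, +1 GAN

# Linear SVM trained by subgradient descent on the regularized hinge loss.
w = np.zeros(d)
b = 0.0
lr, lam = 0.01, 1e-3
for epoch in range(200):
    margins = y * (X @ w + b)
    viol = margins < 1.0                              # points inside the margin
    grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(X)
    grad_b = -y[viol].sum() / len(X)
    w -= lr * grad_w
    b -= lr * grad_b

pred = np.sign(X @ w + b)
acc = (pred == y).mean()
print(acc)   # high training accuracy on this well-separated toy data
```

In the actual method the features would be landmark locations extracted by a face alignment model from real and GAN-synthesized images, with accuracy measured on a held-out split rather than the training set.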