
Haotian Yang


Towards Practical Capture of High-Fidelity Relightable Avatars

Sep 08, 2023
Haotian Yang, Mingwu Zheng, Wanquan Feng, Haibin Huang, Yu-Kun Lai, Pengfei Wan, Zhongyuan Wang, Chongyang Ma


In this paper, we propose a novel framework, Tracking-free Relightable Avatar (TRAvatar), for capturing and reconstructing high-fidelity 3D avatars. Compared to previous methods, TRAvatar works in a more practical and efficient setting. Specifically, TRAvatar is trained with dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation of avatars in diverse scenes. Additionally, TRAvatar enables tracking-free avatar capture, obviating the need for accurate surface tracking under varying illumination conditions. Our contributions are twofold: First, we propose a novel network architecture that explicitly builds in the linear nature of lighting and guarantees it is satisfied by construction. Trained on simple group light captures, TRAvatar can predict appearance in real time with a single forward pass, achieving high-quality relighting under illumination from arbitrary environment maps. Second, we jointly optimize the facial geometry and relightable appearance from scratch based on image sequences, where the tracking is implicitly learned. This tracking-free approach makes it robust to establish temporal correspondences between frames under different lighting conditions. Extensive qualitative and quantitative experiments demonstrate that our framework achieves superior performance for photorealistic avatar animation and relighting.
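
The abstract only states that the network exploits the linearity of light transport, so the following is a minimal sketch of that idea, not the paper's architecture: appearance under any environment map is a weighted sum of appearances predicted for a fixed set of basis light groups. `AppearanceBasisNet`, all dimensions, and the environment-map projection are illustrative assumptions.

```python
# Sketch: exploit linearity of lighting for relighting in one forward pass.
# Hypothetical; does not reproduce TRAvatar's actual network.
import torch
import torch.nn as nn

class AppearanceBasisNet(nn.Module):
    """Predicts one radiance map per basis light group in a single forward pass."""
    def __init__(self, feat_dim: int = 256, num_lights: int = 32, out_hw: int = 64):
        super().__init__()
        self.num_lights = num_lights
        self.out_hw = out_hw
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, num_lights * 3 * out_hw * out_hw),
        )

    def forward(self, face_code: torch.Tensor) -> torch.Tensor:
        b = face_code.shape[0]
        basis = self.decoder(face_code)
        # (B, L, 3, H, W): appearance under each of the L basis light groups.
        return basis.view(b, self.num_lights, 3, self.out_hw, self.out_hw)

def relight(basis: torch.Tensor, env_weights: torch.Tensor) -> torch.Tensor:
    """Linearly combine the per-light basis using environment-map weights.

    env_weights: (B, L) intensities obtained by projecting the environment
    map onto the basis light directions (projection omitted for brevity).
    """
    return torch.einsum("bl,blchw->bchw", env_weights, basis)

# Usage: one forward pass yields the basis; relighting is a cheap weighted sum.
net = AppearanceBasisNet()
code = torch.randn(1, 256)       # latent code driving expression/animation
basis = net(code)
weights = torch.rand(1, 32)      # env map projected onto 32 light groups
image = relight(basis, weights)  # (1, 3, 64, 64)
```

Because the combination is linear, changing the environment map only changes the weights, which is consistent with the real-time relighting claim above.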

* Accepted to SIGGRAPH Asia 2023 (Conference); Project page: https://travatar-paper.github.io/ 

Detailed Facial Geometry Recovery from Multi-view Images by Learning an Implicit Function

Jan 04, 2022
Yunze Xiao, Hao Zhu, Haotian Yang, Zhengyu Diao, Xiangju Lu, Xun Cao


Recovering detailed facial geometry from a set of calibrated multi-view images is valuable for its wide range of applications. Traditional multi-view stereo (MVS) methods adopt optimization-based schemes to regularize the matching cost. Recently, learning-based methods have integrated all of these steps into an end-to-end neural network and shown superior efficiency. In this paper, we propose a novel architecture to recover extremely detailed 3D faces in roughly 10 seconds. Unlike previous learning-based methods that regularize the cost volume via a 3D CNN, we propose to learn an implicit function for regressing the matching cost. By fitting a 3D morphable model to the multi-view images, the features of multiple images are extracted and aggregated in the mesh-attached UV space, which makes the implicit function more effective in recovering detailed facial shape. Our method outperforms SOTA learning-based MVS methods in accuracy by a large margin on the FaceScape dataset. The code and data will be released soon.
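
The abstract replaces 3D-CNN cost-volume regularization with a learned implicit function, so here is a minimal sketch of that substitution under stated assumptions: an MLP maps multi-view features aggregated at a UV-space sample, plus a candidate depth offset along the base-mesh normal, to a scalar matching cost. The names, dimensions, and feature aggregation are hypothetical placeholders.

```python
# Sketch: implicit function regressing matching cost instead of a 3D CNN.
# Hypothetical; feature extraction/aggregation in UV space is omitted.
import torch
import torch.nn as nn

class ImplicitCostFunction(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),  # scalar matching cost
        )

    def forward(self, uv_feats: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        # uv_feats: (N, feat_dim) multi-view features aggregated at UV samples
        # offsets:  (N, 1) candidate displacements along the base-mesh normal
        return self.mlp(torch.cat([uv_feats, offsets], dim=-1))

# For each UV sample, evaluate candidate offsets and keep the lowest-cost one.
f = ImplicitCostFunction()
feats = torch.randn(1024, 64)
candidates = torch.linspace(-1.0, 1.0, steps=8)  # 8 candidate offsets
costs = torch.stack(
    [f(feats, torch.full((1024, 1), float(c))) for c in candidates], dim=-1
)
best = candidates[costs.squeeze(1).argmin(dim=-1)]  # (1024,) chosen offsets
```

Evaluating an MLP at continuous offsets avoids building and convolving a dense cost volume, which is one plausible source of the speedup the abstract reports.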

* Accepted to AAAI 2022 

FaceScape: 3D Facial Dataset and Benchmark for Single-View 3D Face Reconstruction

Nov 01, 2021
Hao Zhu, Haotian Yang, Longwei Guo, Yidi Zhang, Yanru Wang, Mingkai Huang, Qiu Shen, Ruigang Yang, Xun Cao


In this paper, we present a large-scale detailed 3D face dataset, FaceScape, and the corresponding benchmark to evaluate single-view facial 3D reconstruction. By training on FaceScape data, a novel algorithm is proposed to predict elaborate riggable 3D face models from a single image input. The FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects, each with 20 specific expressions. The 3D models contain pore-level facial geometry and are processed to be topologically uniform. These fine 3D facial models can be represented as a 3D morphable model for rough shapes and displacement maps for detailed geometry. Taking advantage of the large-scale and high-accuracy dataset, a novel algorithm is further proposed to learn expression-specific dynamic details using a deep neural network. The learned relationship serves as the foundation of our 3D face prediction system from a single image input. Unlike previous methods, our predicted 3D models are riggable with highly detailed geometry under different expressions. We also use FaceScape data to generate in-the-wild and in-the-lab benchmarks to evaluate recent single-view face reconstruction methods. The accuracy is reported and analyzed along the dimensions of camera pose and focal length, which provides a faithful and comprehensive evaluation and reveals new challenges. The unprecedented dataset, benchmark, and code have been released to the public for research purposes.
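
The two-level representation described above (a 3D morphable model for rough shape plus a displacement map for pore-level detail) can be sketched as below. This is a generic illustration of that representation, assuming linear identity/expression bases and per-vertex displacement along normals; the arrays and names are placeholders, not the released FaceScape code.

```python
# Sketch: coarse 3DMM shape plus normal-direction displacement for fine detail.
# Hypothetical dimensions; real 3DMMs use fitted bases, not random ones.
import numpy as np

def reconstruct_face(mean_shape, id_basis, exp_basis, id_coeff, exp_coeff,
                     normals, displacement):
    """mean_shape: (V, 3); id_basis: (Ki, V, 3); exp_basis: (Ke, V, 3);
    normals: (V, 3) unit vertex normals; displacement: (V,) values sampled
    from the displacement map via the shared UV parameterization."""
    coarse = (mean_shape
              + np.tensordot(id_coeff, id_basis, axes=1)     # identity variation
              + np.tensordot(exp_coeff, exp_basis, axes=1))  # expression variation
    return coarse + displacement[:, None] * normals          # add fine detail

V = 5000  # placeholder vertex count
verts = reconstruct_face(
    np.zeros((V, 3)),
    np.random.randn(50, V, 3), np.random.randn(20, V, 3),  # Ki=50, Ke=20
    np.random.randn(50), np.random.randn(20),
    np.random.randn(V, 3), 0.01 * np.random.randn(V),
)
```

Keeping the topology uniform across subjects is what makes a shared UV space, and hence a shared displacement-map layout, possible.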

* 14 pages, 13 figures, journal extension of FaceScape(CVPR 2020). arXiv admin note: substantial text overlap with arXiv:2003.13989 

Detailed Avatar Recovery from Single Image

Aug 06, 2021
Hao Zhu, Xinxin Zuo, Haotian Yang, Sen Wang, Xun Cao, Ruigang Yang


This paper presents a novel framework to recover a detailed avatar from a single image. It is a challenging task due to factors such as variations in human shape, body pose, texture, and viewpoint. Prior methods typically attempt to recover the human body shape using a parametric template that lacks surface details, so the resulting body shape appears to be unclothed. In this paper, we propose a novel learning-based framework that combines the robustness of the parametric model with the flexibility of free-form 3D deformation. We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation (HMD) framework, utilizing constraints from body joints, silhouettes, and per-pixel shading information. Our method can restore detailed human body shapes with complete textures, going beyond skinned models. Experiments demonstrate that our method outperforms previous state-of-the-art approaches, achieving better accuracy in terms of both 2D IoU and 3D metric distance.
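
The coarse-to-fine refinement idea behind HMD can be sketched as below: a parametric body mesh is deformed in successive stages, each predicting offsets from image-derived cues (joints, silhouettes, shading). The stage networks and the feature extraction are illustrative assumptions, not the paper's implementation.

```python
# Sketch: hierarchical mesh deformation, coarse-to-fine.
# Hypothetical; real cues come from joints, silhouettes, and shading.
import torch
import torch.nn as nn

class StageDeformer(nn.Module):
    """One refinement stage: a 3D offset per vertex from local image features."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 3),  # 3D offset per vertex
        )

    def forward(self, verts: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        return verts + self.net(torch.cat([verts, feats], dim=-1))

def hierarchical_deform(verts, stages, feature_fn):
    # Each stage refines the previous result with progressively finer cues.
    for stage in stages:
        verts = stage(verts, feature_fn(verts))
    return verts

stages = [StageDeformer() for _ in range(3)]                 # joint/anchor/vertex levels
verts = torch.randn(6890, 3)                                 # SMPL-like template size
feature_fn = lambda v: torch.randn(v.shape[0], 128)          # stand-in for image cues
refined = hierarchical_deform(verts, stages, feature_fn)
```

Starting from the parametric template keeps the optimization well-posed, while the free-form per-stage offsets recover clothing-level geometry the template cannot express.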

* Accepted by TPAMI 

FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

Apr 21, 2020
Haotian Yang, Hao Zhu, Yanru Wang, Mingkai Huang, Qiu Shen, Ruigang Yang, Xun Cao


In this paper, we present a large-scale detailed 3D face dataset, FaceScape, and propose a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input. The FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects, each with 20 specific expressions. The 3D models contain pore-level facial geometry and are processed to be topologically uniform. These fine 3D facial models can be represented as a 3D morphable model for rough shapes and displacement maps for detailed geometry. Taking advantage of the large-scale and high-accuracy dataset, a novel algorithm is further proposed to learn expression-specific dynamic details using a deep neural network. The learned relationship serves as the foundation of our 3D face prediction system from a single image input. Unlike previous methods, our predicted 3D models are riggable with highly detailed geometry under different expressions. The unprecedented dataset and code will be released to the public for research purposes.
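
The expression-specific dynamic details described above could be learned by a network that maps UV-space inputs conditioned on an expression code to a displacement map, so that wrinkles change with expression. The sketch below shows one plausible form of such a conditioning scheme; the architecture, shapes, and names are assumptions for illustration only.

```python
# Sketch: predicting an expression-conditioned displacement map in UV space.
# Hypothetical; not the paper's actual network.
import torch
import torch.nn as nn

class DynamicDetailNet(nn.Module):
    def __init__(self, exp_dim: int = 20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + exp_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1),  # scalar displacement per texel
        )

    def forward(self, uv_texture: torch.Tensor, exp_code: torch.Tensor) -> torch.Tensor:
        b, _, h, w = uv_texture.shape
        # Broadcast the expression code over the UV plane and concatenate.
        exp_map = exp_code[:, :, None, None].expand(b, -1, h, w)
        return self.net(torch.cat([uv_texture, exp_map], dim=1))

net = DynamicDetailNet()
disp = net(torch.rand(1, 3, 256, 256), torch.rand(1, 20))  # (1, 1, 256, 256)
```

Conditioning the displacement map on the expression code is what makes the predicted model riggable: driving the rough 3DMM shape with new expression coefficients also re-synthesizes the matching fine detail.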

* Accepted to CVPR 2020 