Liao Wang

Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos

Apr 10, 2023
Liao Wang, Qiang Hu, Qihan He, Ziyu Wang, Jingyi Yu, Tinne Tuytelaars, Lan Xu, Minye Wu

The success of Neural Radiance Fields (NeRF) in modeling and free-view rendering of static objects has inspired numerous attempts on dynamic scenes. Current techniques that utilize neural rendering to facilitate free-view videos (FVVs) are restricted to offline rendering or can process only brief sequences with minimal motion. In this paper, we present a novel technique, the Residual Radiance Field (ReRF), as a highly compact neural representation that achieves real-time FVV rendering of long-duration dynamic scenes. ReRF explicitly models the residual information between adjacent timestamps in the spatial-temporal feature space, with a global coordinate-based tiny MLP as the feature decoder. Specifically, ReRF employs a compact motion grid along with a residual feature grid to exploit inter-frame feature similarities. We show that such a strategy can handle large motions without sacrificing quality. We further present a sequential training scheme to maintain the smoothness and sparsity of the motion/residual grids. Based on ReRF, we design a special FVV codec that achieves a compression rate of three orders of magnitude, and we provide a companion ReRF player to support online streaming of long-duration FVVs of dynamic scenes. Extensive experiments demonstrate the effectiveness of ReRF for compactly representing dynamic radiance fields, enabling an unprecedented free-viewpoint viewing experience in speed and quality.
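
To make the residual idea concrete, here is a minimal sketch of a ReRF-style frame update, assuming dense grids, nearest-neighbor warping, and made-up shapes purely for illustration (the actual representation uses a compact low-resolution motion grid, sparse residuals, and a shared tiny MLP decoder):

```python
import numpy as np

def reconstruct_feature_grid(prev_grid, motion_grid, residual_grid):
    """Hypothetical ReRF-style frame update (illustrative only).

    prev_grid:     (D, H, W, C) feature grid of frame t-1
    motion_grid:   (D, H, W, 3) per-voxel offsets in voxel units (dense here;
                   the paper uses a compact low-resolution motion grid)
    residual_grid: (D, H, W, C) sparse residual, mostly zeros
    """
    D, H, W, C = prev_grid.shape
    # Voxel-center coordinates of frame t.
    zz, yy, xx = np.meshgrid(np.arange(D), np.arange(H), np.arange(W), indexing="ij")
    coords = np.stack([zz, yy, xx], axis=-1).astype(np.float32)
    # Backward-warp: fetch features from where each voxel "came from".
    src = np.rint(coords + motion_grid).astype(int)  # nearest-neighbor for brevity
    src = np.clip(src, 0, [D - 1, H - 1, W - 1])
    warped = prev_grid[src[..., 0], src[..., 1], src[..., 2]]
    # Residual compensation recovers what motion alone cannot explain.
    return warped + residual_grid

# Toy usage: only the compact motion + sparse residual would be streamed per frame.
prev = np.random.rand(8, 8, 8, 4).astype(np.float32)
motion = np.zeros((8, 8, 8, 3), dtype=np.float32)
residual = np.zeros((8, 8, 8, 4), dtype=np.float32)
cur = reconstruct_feature_grid(prev, motion, residual)
assert cur.shape == prev.shape
```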

* Accepted by CVPR 2023. Project page, see https://aoliao12138.github.io/ReRF/ 

Human Performance Modeling and Rendering via Neural Animated Mesh

Sep 18, 2022
Fuqiang Zhao, Yuheng Jiang, Kaixin Yao, Jiakai Zhang, Liao Wang, Haizhao Dai, Yuhui Zhong, Yingliang Zhang, Minye Wu, Lan Xu, Jingyi Yu

We have recently seen tremendous progress in neural advances for photo-realistic human modeling and rendering. However, it is still challenging to integrate them into an existing mesh-based pipeline for downstream applications. In this paper, we present a comprehensive neural approach for high-quality reconstruction, compression, and rendering of human performances from dense multi-view videos. Our core intuition is to bridge the traditional animated mesh workflow with a new class of highly efficient neural techniques. We first introduce a neural surface reconstructor for high-quality surface generation in minutes. It marries implicit volumetric rendering of the truncated signed distance field (TSDF) with multi-resolution hash encoding. We further propose a hybrid neural tracker to generate animated meshes, which combines explicit non-rigid tracking with implicit dynamic deformation in a self-supervised framework. The former provides coarse warping back into the canonical space, while the latter predicts residual displacements using the same 4D hash encoding as in our reconstructor. Then, we discuss rendering schemes using the obtained animated meshes, ranging from dynamic texturing to lumigraph rendering under various bandwidth settings. To strike an intricate balance between quality and bandwidth, we propose a hierarchical solution that first renders 6 virtual views covering the performer and then conducts occlusion-aware neural texture blending. We demonstrate the efficacy of our approach in a variety of mesh-based applications and photo-realistic free-view experiences on various platforms, e.g., inserting virtual human performances into real environments through mobile AR or immersively watching talent shows with VR headsets.
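
The hybrid tracking step can be pictured as the composition below; this is a conceptual sketch under simplified assumptions (inverse-distance skinning stands in for the explicit non-rigid tracker, and a plain callable stands in for the 4D-hash-encoded displacement network; none of the function names come from the paper):

```python
import numpy as np

def coarse_nonrigid_warp(points, node_positions, node_transforms):
    """Explicit-tracking stand-in: blend per-node rigid motions
    (embedded-deformation style) to warp live-frame points toward canonical space."""
    # Inverse-distance weights to deformation nodes (simplified skinning).
    d = np.linalg.norm(points[:, None, :] - node_positions[None, :, :], axis=-1)
    w = 1.0 / (d + 1e-6)
    w /= w.sum(axis=1, keepdims=True)
    # Apply each node's rigid transform, then blend.
    R, t = node_transforms  # R: (K, 3, 3), t: (K, 3)
    per_node = np.einsum("kij,nj->nki", R, points) + t[None, :, :]
    return (w[..., None] * per_node).sum(axis=1)

def implicit_displacement(points, time, net):
    """Implicit-refinement stand-in: a network predicts residual displacements
    from a space-time input (the paper uses a 4D hash encoding)."""
    xt = np.concatenate([points, np.full((len(points), 1), time)], axis=1)
    return net(xt)  # (N, 3) residual offsets

def to_canonical(points, time, nodes, transforms, net):
    # Canonical-space lookup = coarse explicit warp + implicit residual.
    coarse = coarse_nonrigid_warp(points, nodes, transforms)
    return coarse + implicit_displacement(coarse, time, net)

# Toy usage with identity node transforms and a zero-displacement "network".
pts = np.random.rand(5, 3).astype(np.float32)
nodes = np.random.rand(4, 3).astype(np.float32)
transforms = (np.stack([np.eye(3)] * 4), np.zeros((4, 3)))
canon = to_canonical(pts, 0.5, nodes, transforms, lambda x: np.zeros((len(x), 3)))
```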

* 18 pages, 17 figures 

Fourier PlenOctrees for Dynamic Radiance Field Rendering in Real-time

Feb 22, 2022
Liao Wang, Jiakai Zhang, Xinhang Liu, Fuqiang Zhao, Yanshun Zhang, Yingliang Zhang, Minye Wu, Lan Xu, Jingyi Yu

Implicit neural representations such as the Neural Radiance Field (NeRF) have focused mainly on modeling static objects captured under multi-view settings, where real-time rendering can be achieved with smart data structures, e.g., PlenOctree. In this paper, we present a novel Fourier PlenOctree (FPO) technique to tackle efficient neural modeling and real-time rendering of dynamic scenes captured under the free-view video (FVV) setting. The key idea in our FPO is a novel combination of generalized NeRF, the PlenOctree representation, volumetric fusion, and the Fourier transform. To accelerate FPO construction, we present a novel coarse-to-fine fusion scheme that leverages the generalizable NeRF technique to generate the tree via spatial blending. To tackle dynamic scenes, we tailor the implicit network to model the Fourier coefficients of time-varying density and color attributes. Finally, we construct the FPO and train the Fourier coefficients directly on the leaves of a union PlenOctree structure of the dynamic sequence. We show that the resulting FPO incurs only a compact memory overhead for handling dynamic objects and supports efficient fine-tuning. Extensive experiments show that the proposed method is 3000 times faster than the original NeRF and achieves over an order of magnitude acceleration over the state of the art while preserving high visual quality for the free-viewpoint rendering of unseen dynamic scenes.
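
The way time is folded into the octree can be illustrated with a truncated Fourier series evaluated per leaf; the coefficient layout below is a hypothetical choice for the sketch, not the released FPO implementation:

```python
import numpy as np

def eval_fourier_attribute(coeffs, t, period=1.0):
    """Evaluate a truncated Fourier series at normalized time t.

    coeffs: (2K + 1,) array laid out as [a_0, a_1, b_1, ..., a_K, b_K] per
            attribute -- a hypothetical per-leaf layout (the paper stores such
            coefficients on PlenOctree leaves for time-varying density/color).
    """
    value = coeffs[0]
    K = (len(coeffs) - 1) // 2
    for k in range(1, K + 1):
        w = 2.0 * np.pi * k * t / period
        value = value + coeffs[2 * k - 1] * np.cos(w) + coeffs[2 * k] * np.sin(w)
    return value

# Toy leaf: density varies smoothly over the sequence; query two timestamps.
leaf_density_coeffs = np.array([0.8, 0.3, -0.1, 0.05, 0.0])  # K = 2 harmonics
print(eval_fourier_attribute(leaf_density_coeffs, t=0.0))
print(eval_fourier_attribute(leaf_density_coeffs, t=0.5))
```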

* Project page: https://aoliao12138.github.io/FPO/ 

NeuVV: Neural Volumetric Videos with Immersive Rendering and Editing

Feb 12, 2022
Jiakai Zhang, Liao Wang, Xinhang Liu, Fuqiang Zhao, Minzhang Li, Haizhao Dai, Boyuan Zhang, Wei Yang, Lan Xu, Jingyi Yu

Some of the most exciting experiences that the Metaverse promises to offer, for instance, live interactions with virtual characters in virtual environments, require real-time photo-realistic rendering. 3D reconstruction approaches to rendering, whether active or passive, still require extensive cleanup work to fix the meshes or point clouds. In this paper, we present a neural volumography technique, called neural volumetric video or NeuVV, to support immersive, interactive, and spatial-temporal rendering of volumetric video content with photo-realism and in real time. The core of NeuVV is to efficiently encode a dynamic neural radiance field (NeRF) into renderable and editable primitives. We introduce two types of factorization schemes: a hyper-spherical harmonics (HH) decomposition for modeling smooth color variations over space and time, and a learnable basis representation for modeling abrupt density and color changes caused by motion. NeuVV factorization can be integrated into a Video Octree (VOctree), analogous to PlenOctree, to significantly accelerate training while reducing memory overhead. Real-time NeuVV rendering further enables a class of immersive content-editing tools. Specifically, NeuVV treats each VOctree as a primitive and implements volume-based depth ordering and alpha blending to realize spatial-temporal compositions for content re-purposing. For example, we demonstrate positioning varied manifestations of the same performance at different 3D locations with different timing, adjusting the color/texture of the performer's clothing, casting spotlight shadows, synthesizing distance-falloff lighting, etc., all at interactive speed. We further develop a hybrid neural-rasterization rendering framework to support consumer-level VR headsets, so that the aforementioned volumetric video viewing and editing can, for the first time, be conducted immersively in virtual 3D space.
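
The VOctree compositing described above boils down to depth-ordered, front-to-back alpha blending of per-primitive ray samples; a generic sketch with made-up sample data (not NeuVV's actual renderer) looks like this:

```python
import numpy as np

def composite_primitives(samples):
    """Front-to-back alpha compositing along one ray.

    samples: list of (depth, rgb, alpha) tuples collected from all primitives,
             e.g. several volumetric VOctree-like objects placed in one scene.
    """
    color = np.zeros(3)
    transmittance = 1.0
    for depth, rgb, alpha in sorted(samples, key=lambda s: s[0]):  # depth ordering
        color += transmittance * alpha * np.asarray(rgb, dtype=float)
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # early termination once the ray is saturated
            break
    return color, 1.0 - transmittance

# Two overlapping primitives contribute samples to the same ray.
ray_samples = [(2.0, (1.0, 0.2, 0.2), 0.6),   # from primitive A
               (1.5, (0.2, 0.2, 1.0), 0.4),   # from primitive B (closer)
               (2.5, (0.2, 1.0, 0.2), 0.5)]   # from primitive A
rgb, opacity = composite_primitives(ray_samples)
```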

iButter: Neural Interactive Bullet Time Generator for Human Free-viewpoint Rendering

Aug 12, 2021
Liao Wang, Ziyu Wang, Pei Lin, Yuheng Jiang, Xin Suo, Minye Wu, Lan Xu, Jingyi Yu

Generating "bullet-time" effects for human free-viewpoint videos is critical for immersive visual effects and VR/AR experiences. Recent neural advances still lack the controllable and interactive bullet-time design ability for human free-viewpoint rendering, especially under real-time, dynamic, and general settings for our trajectory-aware task. To fill this gap, in this paper we propose a neural interactive bullet-time generator (iButter) for photo-realistic human free-viewpoint rendering from dense RGB streams, which enables flexible and interactive design of human bullet-time visual effects. Our iButter approach consists of a real-time preview and design stage as well as a trajectory-aware refinement stage. During preview, we propose an interactive bullet-time design approach by extending NeRF rendering to a real-time and dynamic setting and eliminating tedious per-scene training. To this end, our bullet-time design stage utilizes a hybrid training set, a light-weight network design, and an efficient silhouette-based sampling strategy. During refinement, we introduce an efficient trajectory-aware scheme that completes within 20 minutes and jointly encodes spatial and temporal consistency as well as semantic cues along the designed trajectory, achieving a photo-realistic bullet-time viewing experience of human activities. Extensive experiments demonstrate the effectiveness of our approach for convenient interactive bullet-time design and photo-realistic human free-viewpoint video generation.
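
For intuition, a bullet-time trajectory can be as simple as a circular sweep of look-at cameras around the performer; the sketch below uses that fixed arc as an illustrative assumption, whereas iButter's design stage lets users edit trajectories interactively:

```python
import numpy as np

def bullet_time_trajectory(center, radius, height, n_views, arc_degrees=180.0):
    """Sample look-at camera poses along a circular arc around a performer.

    A generic bullet-time sweep for illustration only; returns a list of
    4x4 camera-to-world matrices (columns: right, up, -forward, position).
    """
    poses = []
    for theta in np.linspace(0.0, np.radians(arc_degrees), n_views):
        eye = center + np.array([radius * np.cos(theta), radius * np.sin(theta), height])
        forward = center - eye
        forward /= np.linalg.norm(forward)
        right = np.cross(forward, np.array([0.0, 0.0, 1.0]))
        right /= np.linalg.norm(right)
        up = np.cross(right, forward)
        c2w = np.eye(4)
        c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, -forward, eye
        poses.append(c2w)
    return poses

# 30 virtual viewpoints sweeping a half circle around a subject at the origin.
trajectory = bullet_time_trajectory(center=np.zeros(3), radius=2.5, height=1.6, n_views=30)
```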

* Accepted by ACM MM 2021 

MirrorNeRF: One-shot Neural Portrait Radiance Field from Multi-mirror Catadioptric Imaging

Apr 06, 2021
Ziyu Wang, Liao Wang, Fuqiang Zhao, Minye Wu, Lan Xu, Jingyi Yu

Photo-realistic neural reconstruction and rendering of human portraits are critical for numerous VR/AR applications. Still, existing solutions inherently rely on multi-view capture settings, and a one-shot solution that eliminates tedious multi-view synchronization and calibration remains extremely challenging. In this paper, we propose MirrorNeRF, a one-shot neural portrait free-viewpoint rendering approach that uses a catadioptric imaging system with multiple sphere mirrors and a single high-resolution digital camera. It is the first approach to combine a neural radiance field with catadioptric imaging, enabling one-shot photo-realistic human portrait reconstruction and rendering in a low-cost and casual capture setting. More specifically, we propose a light-weight catadioptric system design with a sphere-mirror array to enable diverse ray sampling in the continuous 3D space, as well as an effective online calibration for the camera and the mirror array. Our catadioptric imaging system can be easily deployed on a low budget and supports casual capture for convenient daily use. We introduce a novel neural warping radiance field representation to learn a continuous displacement field that implicitly compensates for the misalignment caused by our flexible system setting. We further propose a density regularization scheme that leverages the inherent geometry information from the catadioptric data in a self-supervised manner, which not only improves training efficiency but also provides more effective density supervision for higher rendering quality. Extensive experiments demonstrate the effectiveness and robustness of our scheme in achieving one-shot, photo-realistic, high-quality free-viewpoint rendering of human portrait scenes.
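
The geometry that makes a sphere-mirror array useful is ordinary ray-sphere reflection: each camera ray that hits a mirror is reflected about the surface normal, yielding a new virtual viewpoint. The sketch below derives this from first principles and is not taken from the MirrorNeRF code:

```python
import numpy as np

def reflect_off_sphere(origin, direction, sphere_center, sphere_radius):
    """Return (hit_point, reflected_direction) or None if the ray misses.

    Standard ray-sphere intersection followed by reflection about the surface
    normal; this is the basic geometry a sphere-mirror catadioptric setup
    relies on to turn one physical camera into many virtual viewpoints.
    """
    d = direction / np.linalg.norm(direction)
    oc = origin - sphere_center
    b = np.dot(oc, d)
    c = np.dot(oc, oc) - sphere_radius ** 2
    disc = b * b - c
    if disc < 0.0:
        return None                      # ray misses the mirror
    t = -b - np.sqrt(disc)               # nearest intersection
    if t <= 0.0:
        return None                      # mirror is behind the camera
    hit = origin + t * d
    normal = (hit - sphere_center) / sphere_radius
    reflected = d - 2.0 * np.dot(d, normal) * normal
    return hit, reflected

# A forward-looking camera ray bouncing off a mirror placed in front of it.
result = reflect_off_sphere(origin=np.zeros(3), direction=np.array([0.0, 0.0, 1.0]),
                            sphere_center=np.array([0.0, 0.0, 3.0]), sphere_radius=0.5)
```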