
Yuanbo Yang


UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields

Mar 28, 2023
Yuanbo Yang, Yifei Yang, Hanlei Guo, Rong Xiong, Yue Wang, Yiyi Liao


Generating photorealistic images with controllable camera pose and scene contents is essential for many applications, including AR/VR and simulation. Although rapid progress has been made in 3D-aware generative models, most existing methods focus on object-centric images and cannot generate urban scenes with free camera viewpoint control and scene editing. To address this challenging task, we propose UrbanGIRAFFE, which uses a coarse 3D panoptic prior, comprising the layout distribution of uncountable stuff and countable objects, to guide a 3D-aware generative model. Our model is compositional and controllable, as it breaks the scene down into stuff, objects, and sky. Using the stuff prior in the form of semantic voxel grids, we build a conditioned stuff generator that effectively incorporates coarse semantic and geometric information. The object layout prior further allows us to learn an object generator from cluttered scenes. With proper loss functions, our approach enables photorealistic 3D-aware image synthesis with diverse controllability, including large camera movement, stuff editing, and object manipulation. We validate the effectiveness of our model on both synthetic and real-world datasets, including the challenging KITTI-360 dataset.

* Project page: https://lv3d.github.io/urbanGIRAFFE 
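
The abstract describes a compositional design: a stuff generator conditioned on semantic voxel grids, an object generator placed by a layout prior, and a sky model, composed along camera rays into a 2D feature map. Below is a minimal, hypothetical PyTorch sketch of that kind of compositional feature-field rendering. All module names, dimensions, and volume-rendering details are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class StuffGenerator(nn.Module):
    """Feature field for 'stuff', conditioned on a local semantic-voxel feature."""
    def __init__(self, voxel_dim=8, feat_dim=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + voxel_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim + 1),          # per-point feature + density
        )

    def forward(self, x, voxel_feat):
        out = self.mlp(torch.cat([x, voxel_feat], dim=-1))
        return out[..., :-1], torch.relu(out[..., -1:])   # (feature, sigma >= 0)

class SkyGenerator(nn.Module):
    """The sky is infinitely far away, so it depends only on the ray direction."""
    def __init__(self, feat_dim=32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim))

    def forward(self, d):
        return self.mlp(d)

def composite_ray(features, sigmas, deltas, sky_feat):
    """Alpha-composite point samples along one ray; the sky feature fills
    whatever transmittance is left after the foreground (stuff + objects)."""
    alpha = 1.0 - torch.exp(-sigmas.squeeze(-1) * deltas)                       # (N,)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha]), dim=0)[:-1]  # T_i
    weights = (trans * alpha).unsqueeze(-1)                                     # (N, 1)
    foreground = (weights * features).sum(dim=0)
    return foreground + (1.0 - weights.sum()) * sky_feat

# One ray with 16 samples; an object generator would be queried analogously
# inside each object's layout box and its samples merged before compositing.
stuff, sky = StuffGenerator(), SkyGenerator()
x = torch.randn(16, 3)                 # sample points along the ray
v = torch.randn(16, 8)                 # semantic-voxel features at those points
d = torch.randn(1, 3)                  # ray direction
f, s = stuff(x, v)
feat_2d = composite_ray(f, s, torch.full((16,), 0.1), sky(d).squeeze(0))
print(feat_2d.shape)                   # torch.Size([32]); a 2D CNN would decode to RGB
```

Rendering features rather than RGB directly is the standard trick in generative neural feature fields (as in GIRAFFE): a lightweight 2D decoder upsamples the composited feature map into the final image.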

InvNorm: Domain Generalization for Object Detection in Gastrointestinal Endoscopy

May 05, 2022
Weichen Fan, Yuanbo Yang, Kunpeng Qiu, Shuo Wang, Yongxin Guo


Domain generalization is a challenging topic in computer vision, especially in gastrointestinal (GI) endoscopy image analysis. Owing to device limitations and ethical constraints, current open-source datasets are typically collected from a limited number of patients using the same brand of sensor. Differences between device brands and between individual patients significantly affect a model's generalizability. To address the generalization problem in GI endoscopy, we therefore propose a multi-domain GI dataset and a lightweight plug-in block called InvNorm (Invertible Normalization), which improves generalization performance in any architecture. Previous domain generalization (DG) methods fail to achieve invertible transformations, which can lead to misleading augmentations and makes the models more likely to raise medical-ethics issues. Our method instead uses normalizing flows to achieve invertible and explainable style normalization. The effectiveness of InvNorm is demonstrated on a wide range of tasks, including GI recognition, GI object detection, and natural image recognition.
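
For intuition about the invertibility claim, below is a minimal RealNVP-style affine coupling layer in PyTorch, the standard building block of normalizing flows. It is a generic sketch, not the paper's actual InvNorm block: it only illustrates how a flow-based transform can be undone exactly, the property the abstract contrasts with earlier non-invertible style augmentations.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling layer: an exactly invertible transform.
    Illustrative stand-in for a flow-based normalization block."""
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, 64), nn.ReLU(),
            nn.Linear(64, 2 * (dim - self.half)),   # predicts scale and shift
        )

    def forward(self, x):
        x1, x2 = x[..., :self.half], x[..., self.half:]
        log_s, t = self.net(x1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)                    # keep scales well-behaved
        y2 = x2 * torch.exp(log_s) + t
        return torch.cat([x1, y2], dim=-1)

    def inverse(self, y):
        y1, y2 = y[..., :self.half], y[..., self.half:]
        log_s, t = self.net(y1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)
        return torch.cat([y1, x2], dim=-1)

# Invertibility check: unlike a lossy style augmentation, the original
# features are recovered exactly (up to floating-point error).
layer = AffineCoupling(dim=8)
x = torch.randn(4, 8)
assert torch.allclose(layer.inverse(layer(x)), x, atol=1e-5)
```

Because half of the input passes through unchanged and the other half is transformed by an affine map whose parameters depend only on the unchanged half, the inverse is available in closed form; stacking such layers with alternating splits yields an expressive yet fully invertible normalization.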
