
Lap-Fai Yu

PEARL: Parallelized Expert-Assisted Reinforcement Learning for Scene Rearrangement Planning

May 10, 2021
Hanqing Wang, Zan Wang, Wei Liang, Lap-Fai Yu

Scene Rearrangement Planning (SRP) is a recently proposed interior task. Previous work defines the action space of this task with handcrafted, coarse-grained actions that are inflexible for transforming scene arrangements and impractical to deploy. Additionally, this new task lacks realistic indoor scene rearrangement data to feed popular data-hungry learning approaches and to support quantitative evaluation. To address these problems, we propose a fine-grained action definition for SRP and introduce a large-scale scene rearrangement dataset. We also propose a novel learning paradigm that efficiently trains an agent through self-play, without any prior knowledge. The agent trained with our paradigm outperforms the baseline agents on the introduced dataset. We provide a detailed analysis of the design of our approach in our experiments.

* 7 pages, 4 figures 
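
As a concrete illustration of what a fine-grained action definition for SRP might look like, the sketch below models the room as a small grid and lets each action move one object by a single cell, with a greedy policy standing in for a learned agent. The grid size, reward, and object names are assumptions made for this sketch, not the paper's actual environment or the PEARL agent.

```python
# A minimal sketch of a fine-grained, per-cell action space for scene rearrangement
# (illustrative only). The 2D grid, one-cell moves, reward, and greedy stand-in policy
# are assumptions for this sketch, not the paper's environment or the PEARL agent.
import random

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

class GridRearrangeEnv:
    def __init__(self, size, start, goal):
        self.size = size        # the room is modeled as a size x size grid
        self.pos = dict(start)  # object id -> current (row, col)
        self.goal = dict(goal)  # object id -> target (row, col)

    def distance(self):
        """Total Manhattan distance between the current and target layouts."""
        return sum(abs(r - gr) + abs(c - gc)
                   for o, (r, c) in self.pos.items()
                   for gr, gc in [self.goal[o]])

    def step(self, obj, move):
        """Fine-grained action: move one object by a single cell if the cell is free."""
        before = self.distance()
        dr, dc = MOVES[move]
        r, c = self.pos[obj]
        nr, nc = r + dr, c + dc
        occupied = {p for o, p in self.pos.items() if o != obj}
        if 0 <= nr < self.size and 0 <= nc < self.size and (nr, nc) not in occupied:
            self.pos[obj] = (nr, nc)
        reward = before - self.distance()   # reward = reduction in distance to goal
        return reward, self.distance() == 0

env = GridRearrangeEnv(5, start={"sofa": (0, 0), "table": (4, 4)},
                          goal={"sofa": (4, 0), "table": (0, 4)})
random.seed(0)
for t in range(200):
    obj = random.choice(list(env.pos))
    (r, c), (gr, gc) = env.pos[obj], env.goal[obj]
    # greedy stand-in for a learned policy: shrink this object's own distance
    move = min(MOVES, key=lambda m: abs(r + MOVES[m][0] - gr) + abs(c + MOVES[m][1] - gc))
    reward, done = env.step(obj, move)
    if done:
        print(f"target layout reached after {t + 1} fine-grained steps")
        break
print("remaining distance to target layout:", env.distance())
```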

Designing Human-Robot Coexistence Space

Nov 14, 2020
Jixuan Zhi, Lap-Fai Yu, Jyh-Ming Lien

As human-robot interactions become ubiquitous, the environment surrounding these interactions will have a significant impact on the safety and comfort of the human and on the effectiveness and efficiency of the robot. Although most robots are designed to work in spaces created for humans, many environments, such as living rooms and offices, can and should be redesigned to enhance human-robot collaboration and interaction. This work uses an autonomous wheelchair as an example and investigates the computational design of human-robot coexistence spaces. Given the room size and the objects $O$ in the room, the proposed framework computes optimal layouts of $O$ that satisfy both human preferences and the navigation constraints of the wheelchair. The key enabling technique is a motion planner that can efficiently evaluate hundreds of similar motion planning problems. Our implementation shows that the proposed framework can produce a design in around three to five minutes on average, compared to 10 to 20 minutes without the proposed motion planner. Our results also show that the proposed method produces reasonable designs even for tight spaces and for users with different preferences.
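
A toy version of the layout search described above: sample candidate furniture placements, discard those for which a stand-in planner finds no collision-free path, and keep the feasible layout with the lowest combined preference-and-navigation cost. The grid, BFS planner, cost weights, and preferred locations are illustrative assumptions, not the paper's framework or its specialized motion planner.

```python
# A toy sketch of the layout search, not the paper's framework: random candidate
# layouts are scored by a human-preference term plus a navigation term, and a BFS
# planner on a coarse grid stands in for the wheelchair motion planner.
import random
from collections import deque

SIZE = 8                      # the room modeled as an 8 x 8 occupancy grid
START, DOCK = (0, 0), (7, 7)  # assumed wheelchair start and docking cells

def path_length(blocked):
    """BFS stand-in for the motion planner: shortest collision-free path, or None."""
    if START in blocked or DOCK in blocked:
        return None
    frontier, seen = deque([(START, 0)]), {START}
    while frontier:
        (r, c), d = frontier.popleft()
        if (r, c) == DOCK:
            return d
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < SIZE and 0 <= nc < SIZE and (nr, nc) not in seen and (nr, nc) not in blocked:
                seen.add((nr, nc))
                frontier.append(((nr, nc), d + 1))
    return None                # no path: the layout blocks the wheelchair

def preference_cost(layout, desired):
    """Human-preference term: how far each object sits from its preferred spot."""
    return sum(abs(r - dr) + abs(c - dc) for (r, c), (dr, dc) in zip(layout, desired))

desired = [(2, 2), (2, 5), (5, 2)]          # hypothetical preferred object locations
cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
best, best_cost = None, float("inf")
random.seed(0)
for _ in range(2000):                       # random search over candidate layouts
    layout = random.sample(cells, len(desired))
    nav = path_length(set(layout))
    if nav is None:                         # infeasible: wheelchair cannot reach the dock
        continue
    cost = preference_cost(layout, desired) + 0.5 * nav
    if cost < best_cost:
        best, best_cost = layout, cost
print("best feasible layout:", best, "cost:", round(best_cost, 1))
```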


Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars

Jun 20, 2018
Chenfanfu Jiang, Siyuan Qi, Yixin Zhu, Siyuan Huang, Jenny Lin, Lap-Fai Yu, Demetri Terzopoulos, Song-Chun Zhu

We propose a systematic learning-based approach to the generation of massive quantities of synthetic 3D scenes and arbitrary numbers of photorealistic 2D images thereof, with associated ground truth information, for the purposes of training, benchmarking, and diagnosing learning-based computer vision and robotics algorithms. In particular, we devise a learning-based pipeline of algorithms capable of automatically generating and rendering a potentially infinite variety of indoor scenes by using a stochastic grammar, represented as an attributed Spatial And-Or Graph, in conjunction with state-of-the-art physics-based rendering. Our pipeline is capable of synthesizing scene layouts with high diversity, and it is configurable inasmuch as it enables the precise customization and control of important attributes of the generated scenes. It renders photorealistic RGB images of the generated scenes while automatically synthesizing detailed, per-pixel ground truth data, including visible surface depth and normal, object identity, and material information (detailed to object parts), as well as environment information (e.g., illumination and camera viewpoints). We demonstrate the value of our synthesized dataset by improving performance in certain machine-learning-based scene understanding tasks (depth and surface normal prediction, semantic segmentation, reconstruction, etc.) and by providing benchmarks for and diagnostics of trained models by modifying object attributes and scene properties in a controllable manner.

* Accepted in IJCV 2018 
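
As a rough sketch of how sampling from a stochastic grammar can generate scene layouts, the toy attributed grammar below expands Or-nodes by picking one child probabilistically and And-nodes by expanding all children, attaching simple positional attributes at the leaves. The node names, probabilities, and attributes are invented for illustration and are not the paper's attributed Spatial And-Or Graph.

```python
# Toy sketch of sampling scene layouts from a stochastic grammar (illustration only).
# Or-nodes choose one child by probability, And-nodes expand every child, and leaves
# carry simple attributes; none of the symbols or numbers come from the paper.
import random

GRAMMAR = {
    "scene":   ("or",  [("bedroom", 0.5), ("office", 0.5)]),
    "bedroom": ("and", ["bed", "lamp"]),
    "office":  ("and", ["desk", "chair", "lamp"]),
}
LEAF_ATTRS = {"bed": (1.8, 2.0), "desk": (1.2, 0.6), "chair": (0.5, 0.5), "lamp": (0.3, 0.3)}

def sample(symbol):
    """Recursively expand a grammar symbol into a list of attributed objects."""
    if symbol in LEAF_ATTRS:
        w, d = LEAF_ATTRS[symbol]
        # leaf attributes: a random position in a 4 m x 4 m room plus the footprint
        return [{"object": symbol, "x": round(random.uniform(0, 4), 2),
                 "y": round(random.uniform(0, 4), 2), "footprint": (w, d)}]
    kind, children = GRAMMAR[symbol]
    if kind == "or":                         # pick exactly one alternative
        names, probs = zip(*children)
        return sample(random.choices(names, weights=probs)[0])
    return [obj for child in children for obj in sample(child)]   # "and": expand all

random.seed(7)
print(sample("scene"))
```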

A Robust 3D-2D Interactive Tool for Scene Segmentation and Annotation

Oct 19, 2016
Duc Thanh Nguyen, Binh-Son Hua, Lap-Fai Yu, Sai-Kit Yeung

Recent advances in 3D acquisition devices have enabled large-scale acquisition of 3D scene data. Such data, if completely and well annotated, can serve as useful ingredients for a wide spectrum of computer vision and graphics tasks, such as data-driven modeling, scene understanding, and object detection and recognition. However, annotating a vast amount of 3D scene data remains challenging due to the lack of an effective tool and/or the complexity of 3D scenes (e.g., clutter, varying illumination conditions). This paper aims to build a robust annotation tool that effectively and conveniently enables the segmentation and annotation of massive 3D data. Our tool works by coupling 2D and 3D information via an interactive framework, through which users can provide high-level semantic annotations for objects. We have experimented with our tool and found that a typical indoor scene can be well segmented and annotated in less than 30 minutes using the tool, as opposed to a few hours if done manually. Along with the tool, we created a dataset of over a hundred 3D scenes with complete annotations produced using our tool. The tool and dataset are available at www.scenenn.net.

* 14 pages 
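
The 2D-3D coupling at the heart of such a tool can be sketched as projecting 3D scene points through a pinhole camera into a user-annotated 2D mask and copying the mask labels back to the visible points. The intrinsics, mask, and points below are synthetic placeholders; the actual tool's interactive framework and models are more involved.

```python
# Minimal sketch of 2D-to-3D label transfer (illustration only): project camera-frame
# points into a labeled 2D mask and copy the mask label to each point that lands inside
# the image. The intrinsics, mask, and points are invented for this example.
import numpy as np

FX = FY = 500.0          # assumed focal lengths (pixels)
CX, CY = 320.0, 240.0    # assumed principal point
H, W = 480, 640          # assumed image size

def project(points):
    """Pinhole projection of Nx3 camera-frame points to pixel coordinates."""
    z = points[:, 2]
    u = FX * points[:, 0] / z + CX
    v = FY * points[:, 1] / z + CY
    return np.stack([u, v], axis=1), z

def transfer_labels(points, mask):
    """Copy 2D mask labels to the 3D points that project inside the image."""
    uv, z = project(points)
    labels = np.zeros(len(points), dtype=mask.dtype)       # 0 = unlabeled
    u, v = np.round(uv[:, 0]).astype(int), np.round(uv[:, 1]).astype(int)
    valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    labels[valid] = mask[v[valid], u[valid]]
    return labels

# Usage: a synthetic "chair" region (label 1) and a few camera-frame points.
mask = np.zeros((H, W), dtype=np.int32)
mask[200:300, 250:400] = 1
points = np.array([[0.0, 0.0, 2.0],     # projects near the image center -> label 1
                   [0.5, 0.5, 2.0],     # projects outside the labeled region -> 0
                   [0.0, 0.0, -1.0]])   # behind the camera -> stays unlabeled
print(transfer_labels(points, mask))
```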