This work considers the problem of depth completion, with or without image data, where an algorithm may measure the depth of a prescribed limited number of pixels. The algorithmic challenge is to choose pixel positions strategically and dynamically to maximally reduce overall depth estimation error. This setting is realized in daytime or nighttime depth completion for autonomous vehicles with a programmable LiDAR. Our method uses an ensemble of predictors to define a sampling probability over pixels. This probability is proportional to the variance of the predictions of ensemble members, thus highlighting pixels that are difficult to predict. By additionally proceeding in several prediction phases, we effectively reduce redundant sampling of similar pixels. Our ensemble-based method may be implemented using any depth-completion learning algorithm, such as a state-of-the-art neural network, treated as a black box. In particular, we also present a simple and effective Random Forest-based algorithm, and similarly use its internal ensemble in our design. We conduct experiments on the KITTI dataset, using the neural network algorithm of Ma et al. and our Random Forest based learner for implementing our method. The accuracy of both implementations exceeds the state of the art. Compared with a random or grid sampling pattern, our method allows a reduction by a factor of 4-10 in the number of measurements required to attain the same accuracy.
Depth acquisition, based on active illumination, is essential for autonomous and robotic navigation. LiDARs (Light Detection And Ranging) with mechanical, fixed, sampling templates are commonly used in today's autonomous vehicles. An emerging technology, based on solid-state depth sensors, with no mechanical parts, allows fast, adaptive, programmable scans. In this paper, we investigate the topic of adaptive, image-driven, sampling and reconstruction strategies. First, we formulate a piece-wise linear depth model with several tolerance parameters and estimate its validity for indoor and outdoor scenes. Our model and experiments predict that, in the optimal case, about 20-60 piece-wise linear structures can approximate well a depth map. This translates to a depth-to-image sampling ratio of about 1/1200. We propose a simple, generic, sampling and reconstruction algorithm, based on super-pixels. We reach a sampling rate which is still far from the optimal case. However, our sampling improves grid and random sampling, consistently, for a wide variety of reconstruction methods. Moreover, our proposed reconstruction achieves state-of-the-art results, compared to image-guided depth completion algorithms, reducing the required sampling rate by a factor of 3-4. A single-pixel depth camera built in our lab illustrates the concept.