In this paper, we propose a neural network architecture for scale-invariant semantic segmentation using RGB-D images. We utilize depth information as an additional modality apart from color images only. Especially in an outdoor scene which consists of different scale objects due to the distance of the objects from the camera. The near distance objects consist of significantly more pixels than the far ones. We propose to incorporate depth information to the RGB data for pixel-wise semantic segmentation to address the different scale objects in an outdoor scene. We adapt to a well-known DeepLab-v2(ResNet-101) model as our RGB baseline. Depth images are passed separately as an additional input with a distinct branch. The intermediate feature maps of both color and depth image branch are fused using a novel fusion block. Our model is compact and can be easily applied to the other RGB model. We perform extensive qualitative and quantitative evaluation on a challenging dataset Cityscapes. The results obtained are comparable to the state-of-the-art. Additionally, we evaluated our model on a self-recorded real dataset. For the shake of extended evaluation of a driving scene with ground truth we generated a synthetic dataset using popular vehicle simulation project CARLA. The results obtained from the real and synthetic dataset shows the effectiveness of our approach.
This paper reports on a novel template-free monocular non-rigid surface reconstruction approach. Existing techniques using motion and deformation cues rely on multiple prior assumptions, are often computationally expensive and do not perform equally well across the variety of data sets. In contrast, the proposed Scalable Monocular Surface Reconstruction (SMSR) combines strengths of several algorithms, i.e., it is scalable with the number of points, can handle sparse and dense settings as well as different types of motions and deformations. We estimate camera pose by singular value thresholding and proximal gradient. Our formulation adopts alternating direction method of multipliers which converges in linear time for large point track matrices. In the proposed SMSR, trajectory space constraints are integrated by smoothing of the measurement matrix. In the extensive experiments, SMSR is demonstrated to consistently achieve state-of-the-art accuracy on a wide variety of data sets.