This paper presents a new approach for the recognition of elements in floor plan layouts. Besides of elements with common shapes, we aim to recognize elements with irregular shapes such as circular rooms and inclined walls. Furthermore, the reduction of noise in the semantic segmentation of the floor plan is on demand. To this end, we propose direction-aware, learnable, additive kernels in the application of both the context module and common convolutional blocks. We apply them for high performance of elements with both common and irregular shapes. Besides, an adversarial network with two discriminators is proposed to further improve the accuracy of the elements and to reduce the noise of the semantic segmentation. Experimental results demonstrate the superiority and effectiveness of the proposed network over the state-of-the-art methods.
Graph Neural Network (GNN) is an effective framework for representation learning and prediction for graph structural data. A neighborhood aggregation scheme is applied in the training of GNN and variants, that representation of each node is calculated through recursively aggregating and transforming representation of the neighboring nodes. A variety of GNNS and the variants are build and have achieved state-of-the-art results on both node and graph classification tasks. However, despite common neighborhood which is used in the state-of-the-art GNN models, there is little analysis on the properties of the neighborhood in the neighborhood aggregation scheme. Here, we analyze the properties of the node, edges, and neighborhood of the graph model. Our results characterize the efficiency of the common neighborhood used in the state-of-the-art GNNs, and show that it is not sufficient for the representation learning of the nodes. We propose a simple neighborhood which is likely to be more sufficient. We empirically validate our theoretical analysis on a number of graph classification benchmarks and demonstrate that our methods achieve state-of-the-art performance on listed benchmarks. The implementation code is available at \url{https://github.com/CODE-SUBMIT/Neighborhood-Enlargement-in-Graph-Network}.
We present a simple and general framework for feature learning from point clouds. The key to the success of CNNs is the convolution operator that is capable of leveraging spatially-local correlation in data represented densely in grids (e.g. images). However, point clouds are irregular and unordered, thus directly convolving kernels against features associated with the points, will result in desertion of shape information and variance to point ordering. To address these problems, we propose to learn an $\mathcal{X}$-transformation from the input points, to simultaneously promote two causes. The first is the weighting of the input features associated with the points, and the second is the permutation of the points into a latent and potentially canonical order. Element-wise product and sum operations of the typical convolution operator are subsequently applied on the $\mathcal{X}$-transformed features. The proposed method is a generalization of typical CNNs to feature learning from point clouds, thus we call it PointCNN. Experiments show that PointCNN achieves on par or better performance than state-of-the-art methods on multiple challenging benchmark datasets and tasks.
Generative adversarial models are powerful tools to model structure in complex distributions for a variety of tasks. Current techniques for learning generative models require an access to samples which have high quality, and advanced generative models are applied to generate samples from noisy training data through ambient modules. However, the modules are only practical for the output space of the generator, and their application in the hidden space is not well studied. In this paper, we extend the ambient module to the hidden space of the generator, and provide the uniqueness condition and the corresponding strategy for the ambient hidden generator in the adversarial training process. We report the practicality of the proposed method on the benchmark dataset.
It has been demonstrated that deep neural networks are prone to noisy examples particular adversarial samples during inference process. The gap between robust deep learning systems in real world applications and vulnerable neural networks is still large. Current adversarial training strategies improve the robustness against adversarial samples. However, these methods lead to accuracy reduction when the input examples are clean thus hinders the practicability. In this paper, we investigate an approach that protects the neural network classification from the adversarial samples and improves its accuracy when the input examples are clean. We demonstrate the versatility and effectiveness of our proposed approach on a variety of different networks and datasets.
While recent deep neural networks have achieved promising results for 3D reconstruction from a single-view image, these rely on the availability of RGB textures in images and extra information as supervision. In this work, we propose novel stacked hierarchical networks and an end to end training strategy to tackle a more challenging task for the first time, 3D reconstruction from a single-view 2D silhouette image. We demonstrate that our model is able to conduct 3D reconstruction from a single-view silhouette image both qualitatively and quantitatively. Evaluation is performed using Shapenet for the single-view reconstruction and results are presented in comparison with a single network, to highlight the improvements obtained with the proposed stacked networks and the end to end training strategy. Furthermore, 3D re- construction in forms of IoU is compared with the state of art 3D reconstruction from a single-view RGB image, and the proposed model achieves higher IoU than the state of art of reconstruction from a single view RGB image.