Mingchao Sun

DO-Conv: Depthwise Over-parameterized Convolutional Layer

Jun 22, 2020
Jinming Cao, Yangyan Li, Mingchao Sun, Ying Chen, Dani Lischinski, Daniel Cohen-Or, Baoquan Chen, Changhe Tu

Convolutional layers are the core building blocks of Convolutional Neural Networks (CNNs). In this paper, we propose to augment a convolutional layer with an additional depthwise convolution, where each input channel is convolved with a different 2D kernel. The composition of the two convolutions constitutes an over-parameterization, since it adds learnable parameters while the resulting linear operation can be expressed by a single convolutional layer. We refer to this depthwise over-parameterized convolutional layer as DO-Conv. We show through extensive experiments that merely replacing conventional convolutional layers with DO-Conv layers boosts the performance of CNNs on many classical vision tasks, such as image classification, detection, and segmentation. Moreover, in the inference phase, the depthwise convolution is folded into the conventional convolution, making the computation exactly equivalent to that of a convolutional layer without over-parameterization. Since DO-Conv introduces performance gains without any increase in inference-time computational complexity, we advocate it as an alternative to the conventional convolutional layer. We open-source a reference implementation of DO-Conv in TensorFlow, PyTorch, and GluonCV at https://github.com/yangyanli/DO-Conv.
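
A minimal sketch of the inference-time folding described above, in PyTorch. The shapes and the D_mul hyperparameter follow the abstract's description, but the variable names and the einsum formulation here are illustrative assumptions, not the authors' reference implementation:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes; names are assumptions, not the authors' API.
C_out, C_in, M, N = 8, 4, 3, 3
D_mul = M * N  # the depthwise "width"; assumed to satisfy D_mul >= M*N

# Trainable over-parameterization:
#   W mixes the D_mul depthwise features into output channels,
#   D is a per-input-channel (depthwise) kernel over spatial positions.
W = torch.randn(C_out, C_in, D_mul)
D = torch.randn(C_in, D_mul, M * N)

# Inference-time folding: for each input channel c, compose the two
# linear maps as a matrix product W[:, c, :] @ D[c], which collapses
# the pair into a plain M x N convolution kernel.
W_folded = torch.einsum('ocd,cdk->ock', W, D).reshape(C_out, C_in, M, N)

x = torch.randn(1, C_in, 16, 16)
y = F.conv2d(x, W_folded, padding=M // 2)  # same cost as a conventional conv
print(y.shape)  # torch.Size([1, 8, 16, 16])
```

Because both operations are linear, training with the extra parameters and folding them away afterwards changes the optimization trajectory but not the inference-time function class.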

Neighborhood Enlargement in Graph Neural Networks

May 21, 2019
Xinhan Di, Pengqian Yu, Mingchao Sun, Rui Bu

Graph Neural Networks (GNNs) are an effective framework for representation learning and prediction on graph-structured data. GNNs and their variants are trained with a neighborhood aggregation scheme, in which the representation of each node is computed by recursively aggregating and transforming the representations of its neighboring nodes. A variety of GNNs and variants have been built and have achieved state-of-the-art results on both node and graph classification tasks. However, despite the common neighborhood used in state-of-the-art GNN models, there is little analysis of the properties of the neighborhood in the neighborhood aggregation scheme. Here, we analyze the properties of the nodes, edges, and neighborhoods of the graph model. Our results characterize the efficiency of the common neighborhood used in state-of-the-art GNNs and show that it is not sufficient for representation learning of the nodes. We propose a simple enlarged neighborhood that is likely to be more sufficient. We empirically validate our theoretical analysis on a number of graph classification benchmarks and demonstrate that our method achieves state-of-the-art performance on the listed benchmarks. The implementation code is available at https://github.com/CODE-SUBMIT/Neighborhood-Enlargement-in-Graph-Network.
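
A minimal sketch of the neighborhood aggregation scheme in dense-adjacency form. The 2-hop enlargement shown is an illustrative assumption about what an enlarged neighborhood could look like, not necessarily the paper's exact construction:

```python
import torch

def aggregate(A, H, W):
    """One GNN layer: each node averages its neighbors' features
    (plus its own) and applies a learned linear transform."""
    A_hat = A + torch.eye(A.size(0))          # add self-loops
    deg = A_hat.sum(dim=1, keepdim=True)      # node degrees
    return torch.relu((A_hat @ H) / deg @ W)  # mean-aggregate, transform

def enlarge(A):
    """Illustrative neighborhood enlargement: reach 2-hop neighbors
    in a single aggregation step."""
    A2 = ((A @ A) > 0).float()                # 2-hop connectivity
    A_en = ((A + A2) > 0).float()
    A_en.fill_diagonal_(0)                    # self-loops re-added above
    return A_en

n, d_in, d_out = 5, 8, 16
A = (torch.rand(n, n) < 0.3).float()
A = ((A + A.t()) > 0).float()                 # symmetrize
A.fill_diagonal_(0)
H = torch.randn(n, d_in)                      # initial node features
W = torch.randn(d_in, d_out)                  # learned weights

out = aggregate(enlarge(A), H, W)
print(out.shape)  # torch.Size([5, 16])
```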

PointCNN: Convolution On $\mathcal{X}$-Transformed Points

Nov 05, 2018
Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Xinhan Di, Baoquan Chen

We present a simple and general framework for feature learning from point clouds. The key to the success of CNNs is the convolution operator, which is capable of leveraging spatially local correlation in data represented densely in grids (e.g., images). Point clouds, however, are irregular and unordered; directly convolving kernels against the features associated with the points would therefore discard shape information and be variant to point ordering. To address these problems, we propose to learn an $\mathcal{X}$-transformation from the input points, which simultaneously serves two purposes: the first is to weight the input features associated with the points, and the second is to permute the points into a latent and potentially canonical order. The element-wise product and sum operations of the typical convolution operator are subsequently applied to the $\mathcal{X}$-transformed features. The proposed method is a generalization of typical CNNs to feature learning from point clouds, and we therefore call it PointCNN. Experiments show that PointCNN achieves performance on par with or better than state-of-the-art methods on multiple challenging benchmark datasets and tasks.

* To be published in NIPS 2018, code available at https://github.com/yangyanli/PointCNN 
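
A minimal sketch of the $\mathcal{X}$-transform idea under assumed shapes: for each representative point, an MLP predicts a K x K matrix from the K neighbors' local coordinates, and that matrix weights and permutes the neighbor features before a standard learned transform. The MLP architecture and class names here are illustrative, not the paper's exact X-Conv operator:

```python
import torch
import torch.nn as nn

class XConvSketch(nn.Module):
    def __init__(self, K, c_in, c_out):
        super().__init__()
        self.mlp_x = nn.Sequential(               # neighbor coords -> K x K matrix
            nn.Flatten(start_dim=1),
            nn.Linear(K * 3, K * K), nn.ReLU(),
            nn.Linear(K * K, K * K),
        )
        self.conv = nn.Linear(K * c_in, c_out)    # "convolution" over the
        self.K = K                                # transformed neighborhood

    def forward(self, local_pts, feats):
        # local_pts: (B, K, 3) neighbor coords relative to the center point
        # feats:     (B, K, c_in) neighbor features
        B, K, _ = local_pts.shape
        X = self.mlp_x(local_pts).view(B, K, K)   # learned X-transform
        f = torch.bmm(X, feats)                   # weight/permute features
        return self.conv(f.flatten(start_dim=1))  # (B, c_out)

xconv = XConvSketch(K=8, c_in=4, c_out=32)
pts, feats = torch.randn(2, 8, 3), torch.randn(2, 8, 4)
print(xconv(pts, feats).shape)  # torch.Size([2, 32])
```

Since X is predicted from the coordinates rather than fixed, two orderings of the same neighborhood can in principle be mapped to the same transformed features, which is what mitigates the ordering sensitivity described in the abstract.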

Large-Scale 3D Shape Reconstruction and Segmentation from ShapeNet Core55

Oct 27, 2017
Li Yi, Lin Shao, Manolis Savva, Haibin Huang, Yang Zhou, Qirui Wang, Benjamin Graham, Martin Engelcke, Roman Klokov, Victor Lempitsky, Yuan Gan, Pengyu Wang, Kun Liu, Fenggen Yu, Panpan Shui, Bingyang Hu, Yan Zhang, Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Minki Jeong, Jaehoon Choi, Changick Kim, Angom Geetchandra, Narasimha Murthy, Bhargava Ramu, Bharadwaj Manda, M Ramanathan, Gautam Kumar, P Preetham, Siddharth Srivastava, Swati Bhugra, Brejesh Lall, Christian Haene, Shubham Tulsiani, Jitendra Malik, Jared Lafer, Ramsey Jones, Siyuan Li, Jie Lu, Shi Jin, Jingyi Yu, Qixing Huang, Evangelos Kalogerakis, Silvio Savarese, Pat Hanrahan, Thomas Funkhouser, Hao Su, Leonidas Guibas

We introduce a large-scale 3D shape understanding benchmark using data and annotations from the ShapeNet 3D object database. The benchmark consists of two tasks: part-level segmentation of 3D shapes and 3D reconstruction from single-view images. Ten teams participated in the challenge, and the best-performing teams outperformed state-of-the-art approaches on both tasks. Several novel deep learning architectures, operating on various 3D representations, were proposed for both tasks. We report the techniques used by each team and the corresponding performance. In addition, we summarize the major discoveries from the reported results and possible trends for future work in the field.
