Tian Jin

The Effect of Data Dimensionality on Neural Network Prunability

Dec 01, 2022
Zachary Ankner, Alex Renda, Gintare Karolina Dziugaite, Jonathan Frankle, Tian Jin

Practitioners prune neural networks for efficiency gains and generalization improvements, but few scrutinize the factors that determine the prunability of a neural network: the maximum fraction of weights that pruning can remove without compromising the model's test accuracy. In this work, we study the properties of input data that may contribute to the prunability of a neural network. For high-dimensional input data such as images, text, and audio, the manifold hypothesis suggests that these inputs approximately lie on or near a significantly lower-dimensional manifold. Prior work demonstrates that this underlying low-dimensional structure may affect the sample efficiency of learning. In this paper, we investigate whether the low-dimensional structure of the input data also affects the prunability of a neural network.
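
As a rough illustration of the quantity under study, the sketch below estimates prunability by sweeping global magnitude-pruning sparsities and reporting the largest one whose test accuracy stays within a tolerance of the dense baseline. It is a minimal, one-shot sketch using PyTorch's pruning utilities; the model, data loader, sparsity grid, and tolerance are placeholders rather than the authors' experimental setup, and a realistic protocol would retrain after pruning.

```python
import copy

import torch
import torch.nn.utils.prune as prune


def evaluate(model, loader, device="cpu"):
    """Top-1 test accuracy."""
    model.eval().to(device)
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            pred = model(x.to(device)).argmax(dim=1)
            correct += (pred == y.to(device)).sum().item()
            total += y.numel()
    return correct / total


def estimate_prunability(model, test_loader, tol=0.01):
    """Largest global sparsity whose accuracy stays within `tol`
    of the dense model's accuracy (one-shot magnitude pruning)."""
    dense_acc = evaluate(model, test_loader)
    prunable = 0.0
    for sparsity in (0.2, 0.4, 0.6, 0.8, 0.9, 0.95, 0.99):
        pruned = copy.deepcopy(model)
        params = [(m, "weight") for m in pruned.modules()
                  if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d))]
        prune.global_unstructured(params,
                                  pruning_method=prune.L1Unstructured,
                                  amount=sparsity)
        if evaluate(pruned, test_loader) >= dense_acc - tol:
            prunable = sparsity
    return prunable
```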

Pruning's Effect on Generalization Through the Lens of Training and Regularization

Oct 25, 2022
Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite

Practitioners frequently observe that pruning improves model generalization. A long-standing hypothesis based on the bias-variance trade-off attributes this generalization improvement to model size reduction. However, recent studies on over-parameterization characterize a new model size regime in which larger models achieve better generalization. Pruning models in this over-parameterized regime leads to a contradiction: while theory predicts that reducing model size harms generalization, pruning to a range of sparsities nonetheless improves it. Motivated by this contradiction, we re-examine pruning's effect on generalization empirically. We show that size reduction cannot fully account for the generalization-improving effect of standard pruning algorithms. Instead, we find that pruning leads to better training at some sparsities, improving the training loss over the dense model, and to additional regularization at other sparsities, reducing the accuracy degradation due to noisy examples relative to the dense model. Pruning extends model training time and reduces model size; these two factors improve training and add regularization, respectively. We empirically demonstrate that both factors are essential to fully explain pruning's effect on generalization.

* Advances in Neural Information Processing Systems 2022  
* 49 pages, 20 figures 
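
Both factors are visible in a standard iterative magnitude-pruning loop, sketched below: every pruning round shrinks the network (the regularization factor) and triggers another retraining phase, so the sparse model accumulates more training than its dense baseline (the improved-training factor). This is a hedged sketch rather than the paper's exact protocol; `train_fn`, the round count, and the per-round fraction are illustrative placeholders.

```python
import torch
import torch.nn.utils.prune as prune


def iterative_magnitude_pruning(model, train_fn, rounds=5, frac=0.2):
    """Prune `frac` of the remaining weights per round, retraining
    after each round. PyTorch's pruning masks accumulate, so each
    global pruning call receives the cumulative sparsity target."""
    train_fn(model)  # dense training phase
    for r in range(1, rounds + 1):
        params = [(m, "weight") for m in model.modules()
                  if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d))]
        cumulative = 1.0 - (1.0 - frac) ** r  # total sparsity after round r
        prune.global_unstructured(params,
                                  pruning_method=prune.L1Unstructured,
                                  amount=cumulative)
        train_fn(model)  # one extra retraining phase per round
    return model
```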

Compiling ONNX Neural Network Models Using MLIR

Oct 01, 2020
Tian Jin, Gheorghe-Teodor Bercea, Tung D. Le, Tong Chen, Gong Su, Haruki Imai, Yasushi Negishi, Anh Leu, Kevin O'Brien, Kiyokuni Kawachiya, Alexandre E. Eichenberger

Deep neural network models are becoming increasingly popular and have been used in various tasks such as computer vision, speech recognition, and natural language processing. Machine learning models are commonly trained in a resource-rich environment and then deployed in distinct environments such as high-availability machines or edge devices. To aid the portability of models, the open-source community has proposed the Open Neural Network Exchange (ONNX) standard. In this paper, we present a high-level, preliminary report on our onnx-mlir compiler, which generates code for the inference of deep neural network models described in the ONNX format. Onnx-mlir is an open-source compiler implemented using the Multi-Level Intermediate Representation (MLIR) infrastructure recently integrated into the LLVM project. Onnx-mlir relies on the MLIR concept of dialects to implement its functionality. We propose two new dialects: (1) an ONNX-specific dialect that encodes the ONNX standard semantics, and (2) a loop-based dialect that provides a common lowering point for all ONNX dialect operations. These intermediate representations facilitate graph-level and loop-based optimizations, respectively. We illustrate our approach by following several models through the proposed representations, and we include some early optimization work and performance results.

* 8 pages 
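
For a concrete picture of the compiler's input, the snippet below builds a toy PyTorch model and exports it to the ONNX format that onnx-mlir consumes. The model is a placeholder of our choosing, and the onnx-mlir invocation in the final comment illustrates the described workflow rather than reproducing a command line from the paper.

```python
import torch

# A tiny stand-in model; any torch.nn.Module exportable to ONNX would do.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 2),
)
model.eval()

# Export to the ONNX interchange format.
dummy_input = torch.randn(1, 4)
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# onnx-mlir then lowers model.onnx through the ONNX dialect and the
# loop-based dialect down to LLVM IR and native code, e.g. (illustrative):
#   onnx-mlir --EmitLib model.onnx
```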

Novel Co-variant Feature Point Matching Based on Gaussian Mixture Model

Oct 26, 2019
Liang Shen, Jiahua Zhu, Chongyi Fan, Xiaotao Huang, Tian Jin

The feature frame is a key concept in the problem of feature matching between two images. However, most traditional matching methods employ only the spatial location information (the coordinates) and ignore the shape and orientation information of the local feature. This additional information can be obtained along with the coordinates using general co-variant detectors such as DoG, Hessian, Harris-Affine, and MSER. In this paper, we develop a novel co-variant feature matching method based on a Gaussian Mixture Model that considers the feature center coordinates together with the local feature's shape and orientation information. We propose three variants of our method for solving the matching problem under different conditions: rigid, affine, and non-rigid, respectively, all of which are optimized by the expectation-maximization (EM) algorithm. Owing to the effective use of the additional shape and orientation information, the proposed model significantly improves performance in terms of convergence speed and recall, and it is more robust to outliers.

* arXiv admin note: text overlap with arXiv:0905.2635 by other authors 
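
The rigid variant can be illustrated with a minimal EM loop in the spirit of GMM-based point-set registration (cf. the Coherent Point Drift formulation of arXiv:0905.2635, noted above): one point set supplies Gaussian centroids, the E-step computes soft correspondences, and the M-step re-estimates the rotation and translation in closed form. This sketch uses coordinates only, whereas the paper's model additionally exploits local shape and orientation, so treat it as a coordinates-only baseline, not the proposed method.

```python
import numpy as np


def rigid_gmm_match(X, Y, n_iter=50, sigma2=1.0):
    """Register movable point set Y to fixed set X with an isotropic
    GMM: the E-step soft-assigns points to centroids, the M-step solves
    a weighted Procrustes problem for rotation R and translation t."""
    D = X.shape[1]
    R, t = np.eye(D), np.zeros(D)
    for _ in range(n_iter):
        TY = Y @ R.T + t  # transformed centroids
        d2 = ((X[:, None, :] - TY[None, :, :]) ** 2).sum(-1)
        P = np.exp(-d2 / (2.0 * sigma2))
        P /= P.sum(axis=1, keepdims=True) + 1e-12  # responsibilities
        Np = P.sum()
        mu_x = P.sum(axis=1) @ X / Np  # weighted means
        mu_y = P.sum(axis=0) @ Y / Np
        A = (X - mu_x).T @ P @ (Y - mu_y)  # weighted cross-covariance
        U, _, Vt = np.linalg.svd(A)
        C = np.diag([1.0] * (D - 1) + [np.linalg.det(U @ Vt)])
        R = U @ C @ Vt  # nearest proper rotation
        t = mu_x - mu_y @ R.T
        TY = Y @ R.T + t  # update the variance with the new transform
        d2 = ((X[:, None, :] - TY[None, :, :]) ** 2).sum(-1)
        sigma2 = max((P * d2).sum() / (Np * D), 1e-8)
    return R, t, P


# Example: recover a known rotation and translation of a 2-D point set.
rng = np.random.default_rng(0)
Y = rng.normal(size=(40, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
X = Y @ R_true.T + np.array([0.5, -0.2])
R_est, t_est, _ = rigid_gmm_match(X, Y)
```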

Clustering Bioactive Molecules in 3D Chemical Space with Unsupervised Deep Learning

Feb 09, 2019
Chu Qin, Ying Tan, Shang Ying Chen, Xian Zeng, Xingxing Qi, Tian Jin, Huan Shi, Yiwei Wan, Yu Chen, Jingfeng Li, Weidong He, Yali Wang, Peng Zhang, Feng Zhu, Hongping Zhao, Yuyang Jiang, Yuzong Chen

Unsupervised clustering has broad applications in data stratification, pattern investigation, and new discovery beyond existing knowledge. In particular, clustering of bioactive molecules facilitates chemical space mapping, structure-activity studies, and drug discovery. These tasks, conventionally conducted with similarity-based methods, are complicated by data complexity and diversity. We explored the superior learning capability of deep autoencoders for unsupervised clustering of 1.39 million bioactive molecules into band-clusters in a 3-dimensional latent chemical space. These band-clusters, displayed by a space-navigation simulation software, band molecules of selected bioactivity classes into individual band-clusters possessing unique sets of common sub-structural features beyond structural similarity. These sub-structural features form the frameworks of literature-reported pharmacophores and privileged fragments. Within each band-cluster, molecules are further banded into sub-regions with respect to their bioactivity target, sub-structural features, and molecular scaffolds. Our method is potentially applicable to big-data clustering tasks in different fields.
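
The core architectural idea, an autoencoder whose bottleneck is the 3-dimensional latent chemical space, can be sketched as follows. The 1024-bit fingerprint input, layer widths, and training snippet are illustrative assumptions rather than the authors' exact network; in practice the encoder's 3-D codes, not the reconstructions, are what get clustered and visualized.

```python
import torch
import torch.nn as nn


class ChemAutoencoder(nn.Module):
    """Compresses a molecular feature vector (here, an assumed
    1024-bit fingerprint) to a 3-D latent point; clustering is
    then performed on the 3-D codes."""

    def __init__(self, in_dim=1024):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 3),  # 3-D latent chemical space
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid(),  # reconstruct bit vector
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z


# One reconstruction-loss training step on placeholder fingerprints;
# molecules sharing sub-structural features end up near one another
# in the 3-D space, forming the band-clusters described above.
model = ChemAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randint(0, 2, (128, 1024)).float()
recon, z = model(x)
loss = nn.functional.binary_cross_entropy(recon, x)
loss.backward()
opt.step()
```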
