Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tianwei Yu

Learning data representation using modified autoencoder for the integrative analysis of multi-omics data

Jun 18, 2019

Tianwei Yu

Figure 1 for Learning data representation using modified autoencoder for the integrative analysis of multi-omics data

Abstract:In integrative analyses of omics data, it is often of interest to extract data embedding from one data type that best reflect relations with another data type. This task is traditionally fulfilled by linear methods such as canonical correlation and partial least squares. However, information contained in one data type pertaining to the other data type may not be in the linear form. Deep learning provides a convenient alternative to extract nonlinear information. Here we develop a method Autoencoder-based Integrative Multi-omics data Embedding (AIME) to extract such information. Using a real gene expression - methylation dataset, we show that AIME extracted meaningful information that the linear approach could not find. The R implementation is available at http://web1.sph.emory.edu/users/tyu8/AIME/.

* 5 pages, 1 figure

Via

Access Paper or Ask Questions

forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction

May 23, 2019

Yunchuan Kong, Tianwei Yu

Figure 1 for forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction

Figure 2 for forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction

Figure 3 for forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction

Figure 4 for forgeNet: A graph deep neural network model using tree-based ensemble classifiers for feature extraction

Abstract:A unique challenge in predictive model building for omics data has been the small number of samples $(n)$ versus the large amount of features $(p)$. This "$n\ll p$" property brings difficulties for disease outcome classification using deep learning techniques. Sparse learning by incorporating external gene network information such as the graph-embedded deep feedforward network (GEDFN) model has been a solution to this issue. However, such methods require an existing feature graph, and potential mis-specification of the feature graph can be harmful on classification and feature selection. To address this limitation and develop a robust classification model without relying on external knowledge, we propose a \underline{for}est \underline{g}raph-\underline{e}mbedded deep feedforward \underline{net}work (forgeNet) model, to integrate the GEDFN architecture with a forest feature graph extractor, so that the feature graph can be learned in a supervised manner and specifically constructed for a given prediction task. To validate the method's capability, we experimented the forgeNet model with both synthetic and real datasets. The resulting high classification accuracy suggests that the method is a valuable addition to sparse deep learning models for omics data.

Via

Access Paper or Ask Questions

Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach

May 26, 2018

Tianwei Yu

Figure 1 for Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach

Figure 2 for Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach

Figure 3 for Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach

Figure 4 for Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach

Abstract:We present a method of variable selection for the sparse generalized additive model. The method doesn't assume any specific functional form, and can select from a large number of candidates. It takes the form of incremental forward stagewise regression. Given no functional form is assumed, we devised an approach termed roughening to adjust the residuals in the iterations. In simulations, we show the new method is competitive against popular machine learning approaches. We also demonstrate its performance using some real datasets. The method is available as a part of the nlnet package on CRAN https://cran.r-project.org/package=nlnet.

Via

Access Paper or Ask Questions

A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data

Feb 12, 2018

Yunchuan Kong, Tianwei Yu

Figure 1 for A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data

Figure 2 for A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data

Figure 3 for A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data

Figure 4 for A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data

Abstract:Gene expression data represents a unique challenge in predictive model building, because of the small number of samples $(n)$ compared to the huge amount of features $(p)$. This "$n<<p$" property has hampered application of deep learning techniques for disease outcome classification. Sparse learning by incorporating external gene network information could be a potential solution to this issue. Still, the problem is very challenging because (1) there are tens of thousands of features and only hundreds of training samples, (2) the scale-free structure of the gene network is unfriendly to the setup of convolutional neural networks. To address these issues and build a robust classification model, we propose the Graph-Embedded Deep Feedforward Networks (GEDFN), to integrate external relational information of features into the deep neural network architecture. The method is able to achieve sparse connection between network layers to prevent overfitting. To validate the method's capability, we conducted both simulation experiments and a real data analysis using a breast cancer RNA-seq dataset from The Cancer Genome Atlas (TCGA). The resulting high classification accuracy and easily interpretable feature selection results suggest the method is a useful addition to the current classification models and feature selection procedures. The method is available at https://github.com/yunchuankong/NetworkNeuralNetwork.

Via

Access Paper or Ask Questions