Abstract: We consider classifiers for high-dimensional data under the strongly spiked eigenvalue (SSE) model. We first show that high-dimensional data often follow the SSE model. We then consider a distance-based classifier that uses eigenstructures under the SSE model, applying the noise-reduction methodology to the estimation of the eigenvalues and eigenvectors. We construct a new distance-based classifier by transforming data from the SSE model to the non-SSE model. We present simulation studies and discuss the performance of the new classifier. Finally, we demonstrate the new classifier on microarray data sets.
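
As a concrete illustration of the ingredients named above, here is a minimal numpy sketch: the noise-reduction (NR) eigenvalue adjustment computed from the n-by-n dual covariance matrix, a transformation that projects out the estimated spiked directions, and a bias-corrected distance rule applied to the transformed data. The function names (nr_eigen, to_non_sse, distance_rule) are ours for illustration; the paper's estimators and transformation are more refined than this sketch.

```python
import numpy as np

def nr_eigen(X, m):
    """Noise-reduction (NR) estimates of the top-m eigenvalues and
    eigenvectors from one class sample X of shape (n, d), n << d."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    # dual (n x n) covariance: shares its nonzero eigenvalues with the
    # d x d sample covariance but is cheap to eigendecompose when d >> n
    SD = Xc @ Xc.T / (n - 1)
    evals, evecs = np.linalg.eigh(SD)
    evals, evecs = evals[::-1], evecs[:, ::-1]        # descending order
    lam = np.empty(m)
    for j in range(1, m + 1):                         # assumes n > m + 1
        # NR adjustment: subtract the average of the trailing (noise)
        # eigenvalues from the j-th sample eigenvalue
        lam[j - 1] = evals[j - 1] - evals[j:].sum() / (n - 1 - j)
    # map dual eigenvectors back to unit-length d-dimensional eigenvectors
    H = Xc.T @ evecs[:, :m]
    H /= np.linalg.norm(H, axis=0)
    return lam, H

def to_non_sse(X, H):
    """Data transformation (sketch): project observations onto the
    orthogonal complement of the estimated spiked eigenvectors H."""
    return X - (X @ H) @ H.T

def distance_rule(x, Z1, Z2):
    """Bias-corrected Euclidean distance rule on transformed data,
    using the usual tr(S_k)/n_k correction terms; returns 1 or 2."""
    def score(Zk):
        Zc = Zk - Zk.mean(axis=0)
        tr_S = (Zc ** 2).sum() / (len(Zk) - 1)        # tr(S_k) without forming S_k
        return np.sum((x - Zk.mean(axis=0)) ** 2) - tr_S / len(Zk)
    return 1 if score(Z1) < score(Z2) else 2
```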




Abstract: In this paper, we consider asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings. We show that the hard-margin linear SVM holds a consistency property, in which misclassification rates tend to zero as the dimension goes to infinity, under certain severe conditions. We show that the SVM is very biased in HDLSS settings and that its performance is directly affected by this bias. To overcome these difficulties, we propose a bias-corrected SVM (BC-SVM) and show that it gives preferable performance in HDLSS settings. We also discuss SVMs in multiclass HDLSS settings. Finally, we examine the performance of the classifiers in actual data analyses.
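
For intuition, the following is a hedged sketch of a bias correction of this flavor, built on scikit-learn's linear SVC with a large C as a stand-in for the hard-margin SVM. The correction shifts the decision scores by an estimate driven by the noise terms tr(S_k)/n_k, which is the source of the bias identified above; the exact form and scaling of the BC-SVM estimator in the paper may differ, so treat the correction below as an assumption for illustration.

```python
import numpy as np
from sklearn.svm import SVC

def tr_cov(X):
    """tr(S) computed without forming the d x d covariance matrix."""
    Xc = X - X.mean(axis=0)
    return (Xc ** 2).sum() / (len(X) - 1)

def bc_svm(X1, X2, X_new, C=1e6):
    """Linear SVM with a heuristic HDLSS bias correction (sketch only)."""
    n1, n2 = len(X1), len(X2)
    X = np.vstack([X1, X2])
    y = np.r_[np.ones(n1), -np.ones(n2)]
    clf = SVC(kernel='linear', C=C).fit(X, y)     # large C ~ hard margin
    f = clf.decision_function(np.atleast_2d(X_new))
    # noise terms tr(S_k)/n_k that drive the HDLSS bias of the SVM
    t1, t2 = tr_cov(X1) / n1, tr_cov(X2) / n2
    # estimated squared mean distance with the noise terms removed
    # (assumed positive; holds asymptotically in the HDLSS regime)
    delta = np.sum((X1.mean(0) - X2.mean(0)) ** 2) - t1 - t2
    f_bc = f - (t1 - t2) / delta                  # shift scores by the bias estimate
    return np.where(f_bc > 0, 1, 2)               # positive score -> class 1
```

When n1 = n2 and the classes have equal covariance traces, t1 - t2 vanishes and the rule reduces to the ordinary SVM, which matches the intuition that the bias comes from imbalance between the two noise terms.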




Abstract: We consider high-dimensional quadratic classifiers in non-sparse settings. In this context, the classification rules do not target the Bayes error rates. The classifier based on the Mahalanobis distance does not always perform preferably, even when the populations are normal distributions with known covariance matrices. The quadratic classifiers proposed in this paper draw information about heterogeneity effectively through the differences of both the expanding mean vectors and the covariance matrices. We show that they hold a consistency property, in which misclassification rates tend to zero as the dimension goes to infinity, under non-sparse settings. We also verify that they are asymptotically normally distributed under certain conditions. In addition, we propose a quadratic classifier after feature selection based on the differences of both the mean vectors and the covariance matrices. Finally, we discuss the performance of the classifiers in actual data analyses. The proposed classifiers achieve highly accurate classification with very low computational cost.
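
To make the inversion-free idea concrete, here is a small sketch of a quadratic rule under a spherical working covariance sigma_k^2 * I, so that only tr(S_k) is needed from each covariance matrix and nothing is inverted. Mean differences enter through the distances and covariance differences through the scale and log terms, QDA-style. This is an illustrative stand-in under that working model, not the paper's exact classifier, and quad_score/classify are our names.

```python
import numpy as np

def tr_cov(X):
    """tr(S_k) computed without forming the d x d covariance matrix."""
    Xc = X - X.mean(axis=0)
    return (Xc ** 2).sum() / (len(X) - 1)

def quad_score(x, Xk):
    """Quadratic score under a spherical working covariance sigma_k^2 * I:
    ||x - mu_k||^2 / sigma_k^2 + d * log(sigma_k^2), with a bias-corrected
    squared distance. Smaller is better."""
    n, d = Xk.shape
    mu = Xk.mean(axis=0)
    sigma2 = tr_cov(Xk) / d                        # scale estimate tr(S_k)/d
    d2 = np.sum((x - mu) ** 2) - tr_cov(Xk) / n    # remove noise from using mu-hat
    # mean differences separate the classes through d2; covariance (scale)
    # differences through sigma2 and the log term
    return d2 / sigma2 + d * np.log(sigma2)

def classify(x, X1, X2):
    """Assign x to the class with the smaller quadratic score."""
    return 1 if quad_score(x, X1) < quad_score(x, X2) else 2
```

The only per-class quantities are the mean vector and tr(S_k), both computable in O(nd) time, which is consistent with the very low computational cost emphasized above.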