Variable importance in binary regression trees and forests

Nov 15, 2007
Hemant Ishwaran

We characterize and study variable importance (VIMP) and pairwise variable associations in binary regression trees. A key component is the node mean squared error for a quantity we refer to as a maximal subtree. The theory extends naturally from single trees to ensembles of trees and applies to methods such as random forests. This is useful because, although importance values from random forests are used to screen variables (for example, to filter high-throughput genomic data in bioinformatics), very little theory exists about their properties.

Electronic Journal of Statistics 2007, Vol. 1, 519-537. Published at http://dx.doi.org/10.1214/07-EJS039 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).