The volume of text-based documents is increasing day by day, and medical documents are part of this growing body of text. In this study, text classification techniques were applied to medical documents and their classification performance was evaluated. The data sets used are multi-class and multi-label. The Chi-Square (CHI) technique was used for feature selection, and the SMO, NB, C4.5, RF and KNN algorithms were used for classification. The aim of the study is to evaluate the success of various classifiers on multi-class, multi-label data sets consisting of medical documents. With fewer than 400 features, KNN was the most successful classifier; with 400 or more features, SMO became the most successful.
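The Chi-Square feature selection step named above can be illustrated with a minimal sketch. It assumes each document is a set of tokens and each document carries a set of labels (to reflect the multi-label setting). The function names `chi_square` and `select_features`, the toy documents, and the choice to keep each term's maximum score over all classes are assumptions made for illustration; the study does not say how per-class scores were combined or which toolkit was used.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table carries no evidence of association.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, k):
    """Return the k terms with the highest chi-square score.

    docs:   list of token sets, one per document
    labels: list of label sets, one per document (multi-label)
    Scoring by the maximum over classes is an assumption of this sketch.
    """
    classes = set().union(*labels)
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in classes:
            a = b = c = d = 0
            for tokens, doc_labels in zip(docs, labels):
                has_term = term in tokens
                in_class = cls in doc_labels
                if has_term and in_class:
                    a += 1
                elif has_term:
                    b += 1
                elif in_class:
                    c += 1
                else:
                    d += 1
            best = max(best, chi_square(a, b, c, d))
        scores[term] = best
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus, not taken from the study's data sets.
docs = [
    {"fever", "cough"},
    {"fever", "rash"},
    {"fracture", "cast"},
    {"fracture", "pain"},
]
labels = [{"infection"}, {"infection"}, {"orthopedics"}, {"orthopedics"}]
print(select_features(docs, labels, 2))
```

Varying `k` in a sketch like this corresponds to the feature-count comparison described in the abstract, where the best classifier changes around 400 features.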
With the widespread use of the internet, the amount of text data is increasing day by day. Poems are one example of this growing body of text. In this study, we aim to classify poems according to their poet. First, a data set consisting of English-language poems by three different poets was constructed. Then, text categorization techniques were applied to it. The Chi-Square technique was used for feature selection, and five different classification algorithms were tried: sequential minimal optimization, Naive Bayes, the C4.5 decision tree, Random Forest and k-nearest neighbors. Although the classifiers produced very different results, sequential minimal optimization achieved a classification success rate above 70%.
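To make the classification step concrete, the following is a minimal sketch of one of the five listed algorithms, k-nearest neighbors, applied to poet attribution. It assumes whitespace tokenization, raw term counts, and cosine similarity; none of these choices is stated in the abstract. The training lines and the labels `poet_a`, `poet_b` and `poet_c` are invented placeholders, not poems from the study's data set, and the helper names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def cosine(u, v):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k most similar texts.

    train: list of (text, label) pairs
    query: text to classify
    """
    q = Counter(tokenize(query))
    ranked = sorted(
        train,
        key=lambda item: cosine(Counter(tokenize(item[0])), q),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented placeholder lines standing in for poems by three poets.
train = [
    ("the sea the tide the salt wind", "poet_a"),
    ("grey sea and restless tide", "poet_a"),
    ("city lamps and iron rails", "poet_b"),
    ("iron bridges over city smoke", "poet_b"),
    ("orchard bloom and meadow lark", "poet_c"),
    ("meadow grass and orchard rain", "poet_c"),
]

print(knn_predict(train, "salt tide on the sea"))
print(knn_predict(train, "iron city"))
```

In a fuller pipeline, the term vectors would first be restricted to the features kept by Chi-Square selection, and the same train/test split would be reused for each of the five classifiers so that their success rates are comparable.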