K-Histograms: An Efficient Clustering Algorithm for Categorical Dataset

Sep 13, 2005
Zengyou He, Xiaofei Xu, Shengchun Deng, Bin Dong


Clustering categorical data is an integral part of data mining and has attracted much attention recently. In this paper, we present k-histogram, a new efficient algorithm for clustering categorical data. The k-histogram algorithm extends the k-means algorithm to the categorical domain by replacing the means of clusters with histograms, and it dynamically updates those histograms during the clustering process. Experimental results on real datasets show that the k-histogram algorithm can produce better clustering results than the k-modes algorithm, the algorithm most closely related to our work.
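The idea of replacing cluster means with histograms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the dissimilarity measure, the seeding, or the stopping rule, so everything below is an assumption made for illustration. Here each cluster keeps one value-frequency histogram per attribute, a record's dissimilarity to a cluster is assumed to be the sum over attributes of one minus the relative frequency of the record's value, the first k records are assumed as seeds, and ties keep a record in its current cluster. The names `dissimilarity` and `k_histograms` are hypothetical.

```python
from collections import Counter


def dissimilarity(record, hists, size):
    """Assumed measure: sum over attributes of (1 - relative frequency
    of the record's value in the cluster's histogram)."""
    if size == 0:
        # An empty cluster is treated as maximally dissimilar.
        return float(len(record))
    return sum(1.0 - h[v] / size for h, v in zip(hists, record))


def k_histograms(records, k, max_iter=20):
    """Sketch of a k-means-style loop with histograms as cluster summaries."""
    n_attrs = len(records[0])
    # hists[c][a] counts the values of attribute a among members of cluster c.
    hists = [[Counter() for _ in range(n_attrs)] for _ in range(k)]
    sizes = [0] * k
    labels = [None] * len(records)

    def add(i, c):
        for h, v in zip(hists[c], records[i]):
            h[v] += 1
        sizes[c] += 1
        labels[i] = c

    def remove(i):
        c = labels[i]
        for h, v in zip(hists[c], records[i]):
            h[v] -= 1
        sizes[c] -= 1

    # Assumed seeding: the first k records start one cluster each.
    for c in range(k):
        add(c, c)

    for _ in range(max_iter):
        moved = False
        for i, rec in enumerate(records):
            old = labels[i]
            dists = [dissimilarity(rec, hists[c], sizes[c]) for c in range(k)]
            best = min(range(k), key=dists.__getitem__)
            # On a tie, keep the record where it already is.
            if old is not None and dists[old] <= dists[best]:
                best = old
            if best != old:
                if old is not None:
                    remove(i)
                # Histograms are updated immediately after each move.
                add(i, best)
                moved = True
        if not moved:
            break
    return labels, hists


records = [
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("red", "small", "oval"),
    ("blue", "large", "round"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
]
labels, hists = k_histograms(records, k=2)
print(labels)
```

On this toy data the red records and the blue records end up in separate clusters. Updating the histograms right after each move, rather than once per pass, is one way to read "dynamically updates histograms"; a batch update at the end of each pass would be an equally plausible reading of the abstract.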

* 11 pages 

