Publication Title : Slice_OP: Selecting Initial Cluster Centers Using Observation Points
Published By : Dr. Md Abdul Masud
Publication Date : 2018-12-29 00:00:00
Publication Online Link : https://link.springer.com/chapter/10.1007%2F978-3-030-05090-0_2
Publication Description :
This paper proposes a new algorithm, Slice_OP, which selects the initial cluster centers for high-dimensional data. A set of observation points is allocated to transform the high-dimensional data into one-dimensional distance data. Multiple Gamma models are built on the distance data and fitted with the expectation-maximization algorithm. The best-fitted model is selected with the second-order Akaike information criterion. We estimate the candidate initial centers from the objects in each component of the best-fitted model. A cluster tree is built from the distance matrix of the candidate initial centers and is divided into K branches. The objects in each branch are analyzed with the k-nearest neighbor algorithm to select the initial cluster centers. The experimental results show that the Slice_OP algorithm outperformed both the state-of-the-art k-means++ algorithm and random center initialization in the k-means algorithm on synthetic and real-world datasets.
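The first steps of the description above (distance transformation and model selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, Euclidean distance is an assumption, and reading the Gamma models as mixtures whose parameters are already fitted is also an assumption. The EM fitting itself, the cluster tree, and the k-nearest neighbor step are omitted.

```python
import math

def distances_to_observation_point(data, point):
    """Map each high-dimensional object to its distance from one observation point.
    Euclidean distance is an assumption; the abstract does not name the metric."""
    return [math.dist(x, point) for x in data]

def gamma_log_pdf(x, shape, scale):
    """Log density of Gamma(shape, scale) at x > 0 (a zero distance would fail here)."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def mixture_log_likelihood(xs, weights, shapes, scales):
    """Log-likelihood of distances under a Gamma mixture with given parameters.
    In the paper the parameters come from EM; here they are simply passed in."""
    total = 0.0
    for x in xs:
        density = sum(w * math.exp(gamma_log_pdf(x, a, s))
                      for w, a, s in zip(weights, shapes, scales))
        total += math.log(density)
    return total

def aicc(log_likelihood, n_params, n_samples):
    """Second-order Akaike information criterion (lower is better).
    Requires n_samples > n_params + 1."""
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_samples - n_params - 1)

# Toy usage: six 3-D objects, one observation point at the origin.
data = [(3, 4, 0), (3, 4, 1), (4, 3, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
dists = distances_to_observation_point(data, (0, 0, 0))

# Compare a one-component model with a two-component model (illustrative parameters).
ll_one = mixture_log_likelihood(dists, [1.0], [2.0], [1.5])
ll_two = mixture_log_likelihood(dists, [0.5, 0.5], [20.0, 100.0], [0.06, 0.05])
score_one = aicc(ll_one, n_params=2, n_samples=len(dists))
# The two-component model would have 5 free parameters, too many for AICc with
# only 6 samples, so a realistic comparison needs far more data than this toy set.
```

With realistic sample sizes, the candidate model with the lowest `aicc` value would be the one kept, and the objects assigned to each of its components would then feed the later center-estimation steps.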