Abstract
• Discovering clusters with varied density is a challenging task.
• The MDCUT algorithm discovers clusters of varied density and arbitrary shape.
• The k-nearest neighbors plot shows the distribution of densities in a data set.
• An interpolation function approximates the k-nearest neighbors distances.
• Local density thresholds are extracted from the interpolated k-nearest neighbors curve.
Building on the promising performance of density-based clustering, we present a novel density-based clustering algorithm called MDCUT (MultiDensity ClUsTering). Its main merit is the ability to cluster data with varied density. It operates in two phases. First, it finds the appropriate number of density levels in a data set by applying an exponential spline to the k-nearest neighbors' distances. Second, it uses these levels as local density thresholds to determine clusters with varied densities and arbitrary shapes. We show experimentally that MDCUT correctly detects the density levels in a data set and discovers arbitrarily shaped clusters in decreasing order of density. We validate the clustering results in terms of clustering error, precision, and recall on various data sets. MDCUT performs well in comparison with several other clustering algorithms, including DBSCAN.
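To make the first phase more concrete, the following minimal Python sketch builds the sorted k-nearest neighbors distance curve for a toy data set and reads a single density threshold off it. It is an illustration under stated assumptions, not the MDCUT procedure itself: the exponential spline step is not reproduced and is replaced here by a simple largest-gap heuristic, the neighbor search is brute force, and the function names `sorted_knn_distances` and `largest_gap_threshold` are hypothetical.

```python
import numpy as np


def sorted_knn_distances(points, k):
    """Return each point's distance to its k-th nearest neighbor, sorted ascending."""
    pts = np.asarray(points, dtype=float)
    # Brute-force pairwise Euclidean distances (adequate for small data sets only).
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbor.
    kth = np.sort(dists, axis=1)[:, k]
    return np.sort(kth)


def largest_gap_threshold(curve):
    """Hypothetical stand-in for the spline analysis: split the sorted curve
    at its largest jump and return the values on both sides of that jump."""
    gaps = np.diff(curve)
    i = int(np.argmax(gaps))
    return curve[i], curve[i + 1]


def grid(n, spacing, offset):
    """Regular n-by-n grid of 2-D points, used as a deterministic toy cluster."""
    axis = np.arange(n) * spacing + offset
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])


# Toy data: a dense cluster (100 points) and a sparse cluster (25 points).
data = np.vstack([grid(10, 0.1, 0.0), grid(5, 1.0, 10.0)])

curve = sorted_knn_distances(data, k=4)
low, high = largest_gap_threshold(curve)

# Points whose k-distance is at most `low` belong to the denser level.
n_dense = int((curve <= low).sum())
print(f"jump between {low:.3f} and {high:.3f}; {n_dense} points in the dense level")
```

On this toy data the sorted curve has two plateaus, one per density level, and the jump between them separates the 100 dense points from the 25 sparse ones. A data set with more than two density levels would need more than one threshold, which a single largest-gap split does not provide.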