Chameleon 2
Citations over time: top 10% of 2019 papers
Abstract
Traditional clustering algorithms fail to produce human-like results when confronted with data of variable density or complex distributions, or when noise is present. We propose an improved graph-based clustering algorithm called Chameleon 2, which overcomes several drawbacks of state-of-the-art clustering approaches. We modified the internal cluster quality measure and added an extra step to ensure algorithm robustness. Our results reveal a significant positive impact on clustering quality, measured by Normalized Mutual Information (NMI), on 32 artificial datasets used in the clustering literature. This improvement is also confirmed on real-world datasets. The performance of clustering algorithms such as DBSCAN is extremely sensitive to parameter settings, and exhaustive manual parameter tuning is necessary to obtain a meaningful result. All hierarchical clustering methods are very sensitive to cutoff selection, and a human expert is often required to find the true cutoff for each clustering result. We present an automated cutoff selection method that enables the Chameleon 2 algorithm to generate high-quality clusterings in autonomous mode.
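As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how Normalized Mutual Information between a ground-truth labeling and a predicted clustering could be computed. The abstract does not say which normalization the authors use; this sketch assumes the arithmetic mean of the two label entropies, and the function names (`entropy`, `mutual_information`, `nmi`) are hypothetical helpers, not part of Chameleon 2.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a flat cluster labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same points."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
    return mi


def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumption of this sketch)."""
    if len(a) != len(b):
        raise ValueError("labelings must cover the same points")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 and h_b == 0.0:
        # Both labelings put every point in one cluster: treat as identical.
        return 1.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(nmi(truth, [1, 1, 0, 0]))  # same partition, cluster ids swapped
    print(nmi(truth, [0, 1, 0, 1]))  # partition independent of the truth
```

Because NMI compares partitions rather than label values, renaming the clusters does not change the score, which is why it suits comparisons between algorithms that assign arbitrary cluster ids.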