Evaluation of an Optimized K-Means Algorithm Based on Real Data
Abstract
In a previous paper, an optimization of the K-Means algorithm was proposed. Unlike the standard version of K-Means, which iteratively traverses the entire data set to decide which cluster each data item belongs to, the proposed optimization relies on the observation that after only a few iterations the centroids are already very close to their final positions, so only a few data items still switch clusters. Consequently, after a small number of iterations, most of the processing time in the standard algorithm is wasted on checking items that have already reached their final cluster. In the optimized version, each iteration re-checks only the data items that might switch clusters because of the centroids' deviation. The prototype implementation was evaluated using data produced by a uniform-distribution random number generator, and the evaluation showed a reduction in running time of up to 70%. This paper evaluates the optimized K-Means against real data sets from different domains.
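The idea of re-checking only the items that might switch clusters can be illustrated with a minimal sketch. The abstract does not state the exact criterion used to select those items, so the rule below is an assumption rather than the authors' method: each item keeps a margin (distance to its second-nearest centroid minus distance to its nearest one), and by the triangle inequality an item cannot switch clusters while that margin exceeds twice the largest centroid shift accumulated since its last check. The names `nearest_two` and `filtered_kmeans` are hypothetical, and the sketch assumes at least two centroids.

```python
import math


def nearest_two(point, centroids):
    """Return (index of nearest centroid, second-nearest distance minus nearest distance)."""
    dists = [math.dist(point, c) for c in centroids]
    best = min(range(len(dists)), key=dists.__getitem__)
    second = min(d for i, d in enumerate(dists) if i != best)
    return best, second - dists[best]


def filtered_kmeans(points, centroids, max_iter=100):
    """K-Means that re-checks only items whose margin may have been used up.

    Assumption (not taken from the paper): an item is re-checked once twice the
    accumulated maximum centroid shift reaches its stored margin.
    Returns (labels, centroids, number of item checks performed).
    """
    centroids = [tuple(c) for c in centroids]
    labels, margins = [], []
    for p in points:  # the first pass is a full scan, as in standard K-Means
        label, gap = nearest_two(p, centroids)
        labels.append(label)
        margins.append(gap)
    checks = len(points)

    for _ in range(max_iter):
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        max_shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if max_shift == 0:
            break  # centroids did not move, so no item can switch

        # Assignment step: shrink every margin, re-check only the risky items.
        for i, p in enumerate(points):
            margins[i] -= 2 * max_shift
            if margins[i] <= 0:
                labels[i], margins[i] = nearest_two(p, centroids)
                checks += 1
    return labels, centroids, checks
```

Calling `filtered_kmeans(points, initial_centroids)` with a list of coordinate tuples returns the labels, the final centroids, and a count of item checks; comparing that count with the number of items times the number of iterations gives a rough sense of how much scanning is skipped. Any actual saving depends on the data and on the initial centroids.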