Distributed outlier detection in hierarchically structured datasets with mixed attributes
Abstract
Anomaly detection has been studied extensively over the past decades; however, challenges remain because of the complex structure of real-world datasets. First, only a few methods in the literature handle datasets that have both categorical and continuous attributes, and even fewer are sensitive to the dependencies between the two types of attributes. Second, real-world datasets tend to be structurally complex, and their categorical attributes are usually hierarchically correlated, a property that existing outlier detection approaches have largely ignored. Following this line of reasoning, we propose a distributed outlier detection method for mixed-attribute datasets, especially those with hierarchical categorical attributes. The proposed method accounts for the dependencies between categorical and continuous attributes rather than treating them as two separate parts. In addition, it captures the hierarchical structure among categorical attributes. Experimental results on a real-world dataset and a simulation study show its superior performance in terms of both detection accuracy and time efficiency.