How to Track Your Data: Rule-Based Data Provenance Tracing Algorithms
Citation ranking: top 10% of 2012 papers
Abstract
As cloud computing and virtualization technologies become mainstream, the ability to track data has grown in importance. Tracking data from its creation to its current or end state enables full transparency and accountability in cloud computing environments. In this paper, we present a novel technique for tracking end-to-end data provenance, that is, metadata describing the derivation history of data. This capability is crucial because it enhances trust and security in complex computer systems and communication networks. By analyzing provenance, it is possible to detect various data leakage threats and alert data administrators and owners, thereby addressing the growing need for trust in, and security of, customers' data. We also present our rule-based data provenance tracing algorithms, which trace data provenance to detect the actual operations that have been performed on files, especially files at risk of leaking customers' data. We implemented the cloud data provenance algorithms in existing software with a rule correlation engine, show how the algorithms perform in detecting various data leakage threats, and discuss their technical capabilities and limitations.
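To make the idea of rule-based provenance tracing more concrete, the following is a minimal illustrative sketch, not the algorithms from the paper. It assumes a hypothetical log of low-level file events (`Event`, with made-up fields `ts`, `pid`, `op`, `path`) and three hand-written correlation rules; the rule names, the `window` parameter, and the sample log are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical low-level log record (not the paper's log format)."""
    ts: int    # logical timestamp
    pid: int   # process that issued the call
    op: str    # "read", "write", "unlink", or "send"
    path: str  # file path, or remote endpoint for "send"


def trace_operations(events, window=5):
    """Correlate low-level events into higher-level operations.

    Assumed rules (illustrative only):
      - read of A then write of B by the same process within `window` -> copy
      - read of A then network send by the same process within `window`
        -> possible leak
      - unlink of A -> delete
    """
    findings = []
    last_read = {}  # pid -> most recent read Event by that process
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.op == "read":
            last_read[ev.pid] = ev
            continue
        src = last_read.get(ev.pid)
        recent = src is not None and ev.ts - src.ts <= window
        if ev.op == "write" and recent and ev.path != src.path:
            findings.append(("copy", src.path, ev.path))
        elif ev.op == "send" and recent:
            findings.append(("possible_leak", src.path, ev.path))
        elif ev.op == "unlink":
            findings.append(("delete", ev.path, None))
    return findings


# Made-up sample log for demonstration.
log = [
    Event(1, 100, "read", "/data/customers.csv"),
    Event(2, 100, "write", "/tmp/out.csv"),
    Event(3, 200, "read", "/data/customers.csv"),
    Event(4, 200, "send", "203.0.113.7:443"),
    Event(9, 100, "unlink", "/tmp/out.csv"),
]

for finding in trace_operations(log):
    print(finding)
```

Each finding is a tuple of (operation, source, destination). A real rule correlation engine would likely track many reads per process, handle renames and partial writes, and load its rules from configuration rather than hard-coding them; this sketch only shows the general shape of correlating raw events into file-level operations.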