Scalable Complex Query Processing over Large Semantic Web Data Using Cloud
Citations Over Time: Top 24% of 2011 papers
Abstract
Cloud computing solutions continue to grow in popularity in both research and the commercial IT industry, and this popularity brings ever-increasing challenges for cloud computing service providers. The Semantic Web is another domain of rapid growth in both research and industry. RDF datasets are becoming increasingly large and complex, and existing solutions do not scale adequately. In this paper, we detail a scalable Semantic Web framework built using cloud computing technologies. We define solutions for generating and executing optimal query plans. We handle not only queries with Basic Graph Patterns (BGPs) but also complex queries with optional blocks. We have devised a novel algorithm to handle these complex queries. Our algorithm minimizes the binding of triple patterns and the joins between them by identifying common blocks through subgraph isomorphism detection and building a query plan that uses this information. We use Hadoop's MapReduce framework to process the query plan. We show that our framework is extremely scalable and answers complex queries efficiently.
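To make the kind of query discussed in the abstract concrete, the following is a minimal sketch of how a query with one required triple pattern and one optional block could be answered with a map phase and a reduce phase. It is not the paper's algorithm: it does not detect common blocks or perform query planning, and it runs in plain Python rather than on Hadoop. The data set, the function names (`match`, `map_phase`, `reduce_phase`, `run_query`), and the two patterns are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Toy RDF triples (hypothetical data, not taken from the paper).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "email", "alice@example.org"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "alice"),
    ("carol", "email", "carol@example.org"),
]


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable that appears twice must bind to the same value.
            if binding.get(term, value) != value:
                return None
            binding[term] = value
        elif term != value:
            return None
    return binding


def map_phase(triples, required, optional, join_var):
    """Emit (join key, (tag, binding)) pairs, as a mapper would."""
    for triple in triples:
        for tag, pattern in (("req", required), ("opt", optional)):
            binding = match(pattern, triple)
            if binding is not None:
                yield binding[join_var], (tag, binding)


def reduce_phase(pairs):
    """Group by join key, then left-outer-join required with optional bindings."""
    groups = defaultdict(lambda: {"req": [], "opt": []})
    for key, (tag, binding) in pairs:
        groups[key][tag].append(binding)
    results = []
    for key in sorted(groups):
        for req in groups[key]["req"]:
            # With no optional match, keep the required binding on its own.
            for opt in groups[key]["opt"] or [{}]:
                results.append({**req, **opt})
    return results


def run_query(triples):
    # Roughly: SELECT * WHERE { ?x knows ?y OPTIONAL { ?x email ?e } }
    required = ("?x", "knows", "?y")
    optional = ("?x", "email", "?e")
    return reduce_phase(map_phase(triples, required, optional, "?x"))


if __name__ == "__main__":
    for row in run_query(TRIPLES):
        print(row)
```

In this sketch the optional block behaves like a left outer join: `bob` has no `email` triple, so his row is kept without a binding for `?e`. Each additional join variable would need another map and reduce pass over the data, which gives a sense of why reducing the number of bindings and joins matters for a MapReduce-based plan.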