Cluster-Driven Navigation of the Query Space
Citations Over Time: Top 10% of 2016 papers
Abstract
How can users who know neither programming nor statistics explore large databases? We present Blaeu, a novel interface designed to guide explorers through their data. Blaeu is a database front-end, “boosted” with unsupervised learning primitives. Thanks to these primitives, it can summarize and recommend queries. Our first contribution is Blaeu's interaction model. With Blaeu, users explore the data through data maps. A data map is an interactive set of clusters, which users navigate with zooms and projections. Our second contribution is Blaeu's engine. We present three mapping algorithms, for three different settings. The first algorithm deals with small to medium databases, the second one targets high-dimensional spaces, and the last one focuses on speed and interaction. We then present an optimization strategy based on sampling. Our experiments reveal that Blaeu can cluster millions of tuples with hundreds of columns in a few seconds on commodity hardware.
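To make the data-map idea concrete, the following is a minimal sketch and not Blaeu's actual engine. The abstract does not specify the mapping algorithms, so plain k-means is swapped in as an assumption, and the names `build_map` and `zoom` are hypothetical. The sketch projects tuples onto a chosen set of columns, clusters only a random sample, assigns every tuple to its nearest centroid, and lets the caller zoom into one cluster.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist2(point, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for Blaeu's algorithms)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if the group happens to be empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def build_map(rows, columns, k, sample_size=200, seed=0):
    """Hypothetical 'data map': one cluster label per row.

    Projection: only `columns` are used for clustering.
    Sampling: centroids are fitted on a random sample, then every row
    is assigned to its nearest centroid.
    """
    rng = random.Random(seed)
    projected = [tuple(r[c] for c in columns) for r in rows]
    sample = rng.sample(projected, min(sample_size, len(projected)))
    centroids = kmeans(sample, k, seed=seed)
    return [nearest(p, centroids) for p in projected]


def zoom(rows, labels, cluster_id):
    """Hypothetical 'zoom': keep only the rows of one cluster."""
    return [r for r, label in zip(rows, labels) if label == cluster_id]
```

Under these assumptions, a user session would call `build_map` on the full table, pick a cluster, call `zoom` to restrict the rows, and then call `build_map` again on the result, possibly with a different column list. Fitting centroids on a sample keeps the expensive step independent of table size; only the final nearest-centroid assignment scans every row.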