The Bloom paradox: When not to use a Bloom filter?
Abstract
In this paper, we uncover the Bloom paradox in Bloom filters: sometimes, it is better to disregard the query results of Bloom filters, and in fact not to even query them, thus making them useless. We first analyze the conditions under which the Bloom paradox occurs in a Bloom filter, and demonstrate that it depends on the a priori probability that a given element belongs to the represented set. We show that the Bloom paradox also applies to Counting Bloom Filters (CBFs), where it depends on the product of the hashed counters of each element. In addition, for both Bloom filters and CBFs, we suggest improved architectures that deal with the Bloom paradox. We also provide fundamental lower bounds on the memory required to support element queries with limited false-positive and false-negative rates. Finally, using simulations, we verify our theoretical results and show that our improved schemes can significantly improve the performance of Bloom filters and CBFs.
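The dependence on the a priori probability can be illustrated with a short Bayes calculation. The sketch below is not the paper's own derivation: it assumes the standard false-positive approximation for a Bloom filter with `m_bits` bits, `n_items` inserted elements, and `k_hashes` hash functions, assumes no false negatives, and uses a hypothetical two-cost model (`cost_fp`, `cost_fn`) to decide whether a positive answer is worth acting on. All function and parameter names are illustrative.

```python
import math


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation of the Bloom filter false-positive rate:
    (1 - e^{-k n / m})^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def posterior_membership(prior: float, fpr: float) -> float:
    """P(x in S | filter answers positive), by Bayes' rule.

    Assumes the filter has no false negatives, so a member always
    yields a positive answer.
    """
    return prior / (prior + (1.0 - prior) * fpr)


def should_trust_positive(prior: float, fpr: float,
                          cost_fp: float, cost_fn: float) -> bool:
    """Hypothetical decision rule (an assumption, not taken from the paper):
    act on a positive answer only if the expected cost of a false positive
    is lower than the expected cost of ignoring a true member."""
    post = posterior_membership(prior, fpr)
    return (1.0 - post) * cost_fp < post * cost_fn


if __name__ == "__main__":
    fpr = false_positive_rate(m_bits=10_000, n_items=1_000, k_hashes=7)
    for prior in (1e-6, 1e-3, 0.5):
        post = posterior_membership(prior, fpr)
        trust = should_trust_positive(prior, fpr, cost_fp=1.0, cost_fn=1.0)
        print(f"prior={prior:g}  posterior={post:.6f}  trust positive: {trust}")
```

With a very small prior, the posterior stays far below one half even though the false-positive rate is under 1%, so under equal costs a positive answer is more likely wrong than right. Because a negative answer would lead to the same action as ignoring the filter, querying it gains nothing in that regime, which is the situation the abstract describes.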