Efficient Large-Scale Similarity Search Using Matrix Factorization
Citations: top 10% of 2016 papers
Abstract
We consider the image retrieval problem of finding the images in a dataset that are most similar to a query image. Our goal is to reduce the number of vector operations and the memory required to perform a search without sacrificing the accuracy of the returned images. We adopt a group testing formulation and design the decoding architecture using either dictionary learning or eigendecomposition. The latter is a plausible option for small-to-medium sized problems with high-dimensional global image descriptors, whereas dictionary learning is applicable in large-scale scenarios. We evaluate our approach on global descriptors obtained from both SIFT and CNN features. Experiments with standard image search benchmarks, including the Yahoo100M dataset comprising 100 million images, show that our method gives accuracy comparable (and sometimes superior) to exhaustive search while requiring only 10% of the vector operations and memory. Moreover, for the same search complexity, our method gives significantly better accuracy than approaches based on dimensionality reduction or locality-sensitive hashing.
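To make the idea of searching through a factorized dataset concrete, here is a minimal sketch in Python with NumPy. It is not the paper's construction: it assumes a generic truncated eigendecomposition that approximates the descriptor matrix X (d x n) by Y @ H, with Y (d x m) and m much smaller than d, and the decoding architecture designed in the paper may differ. The names `build_index` and `search` and all sizes are hypothetical, chosen for illustration only.

```python
import numpy as np


def build_index(X, m):
    """Factor X (d x n, one descriptor per column) as X ~= Y @ H.

    Assumption: Y holds the m leading eigenvectors of X @ X.T.
    This is a generic low-rank choice, not necessarily the paper's.
    """
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    _, eigvecs = np.linalg.eigh(X @ X.T)
    Y = eigvecs[:, -m:]   # d x m, the m leading eigenvectors
    H = Y.T @ X           # m x n, coefficients of each descriptor
    return Y, H


def search(q, Y, H, k):
    """Approximate the similarities X.T @ q in two cheap steps."""
    group_scores = Y.T @ q        # m inner products in dimension d
    approx = H.T @ group_scores   # n inner products in dimension m
    top = np.argsort(-approx)[:k]  # indices of the k largest scores
    return top, approx


rng = np.random.default_rng(0)
d, n, m = 64, 1000, 8
# Synthetic descriptors of exact rank m, so the factorization is lossless here.
X = rng.standard_normal((d, m)) @ rng.standard_normal((m, n))
X /= np.linalg.norm(X, axis=0)  # unit-norm columns
q = X[:, 42]                    # query with a known nearest neighbour

Y, H = build_index(X, m)
top, approx = search(q, Y, H, k=5)
```

In this sketch a query costs roughly m(d + n) multiply-adds instead of the d n needed by exhaustive search. Real descriptors are not exactly low rank, so the approximation would be lossy on them, and the dense H used here would presumably have to be replaced by something sparser (for example via dictionary learning) to save memory at large scale.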
Related Papers
- LSH vs Randomized Partition Trees: Which One to Use for Nearest Neighbor Search? (2014), cited by 13
- Large-Scale Distributed Locality-Sensitive Hashing for General Metric Data (2014), cited by 9
- Theoretical analysis on pruning nearest neighbor candidates by locality sensitive hashing (2010)
- Review on Locality Sensitive Hashing in Centralized Environment (2015)