Semantic hashing using tags and topic modeling
Citations Over Time: Top 10% of 2013 papers
Abstract
Designing efficient and effective solutions for large-scale similarity search is an important research problem. One popular strategy is semantic hashing, which represents data examples as compact binary codes and has produced promising results with fast search speed and low storage cost. Many existing semantic hashing methods generate binary codes for documents by modeling document relationships based on similarity in a keyword feature space. These methods have two major limitations: (1) tag information is associated with documents in many real-world applications but has not yet been fully exploited; (2) similarity in a keyword feature space does not fully reflect semantic relationships that go beyond keyword matching.
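To make the idea of searching over compact binary codes concrete, here is a minimal sketch of Hamming-distance lookup. It is a generic illustration and not the method proposed in the paper. The 8-bit codes, the document ids, and the helper names `hamming_distance` and `search` are all hypothetical, and the codes are hand-picked rather than learned from documents or tags.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def search(query_code: int, codes: dict, k: int = 2) -> list:
    """Return the ids of the k codes closest to the query in Hamming distance.

    Ties are broken by document id so the ordering is deterministic.
    """
    ranked = sorted(
        codes,
        key=lambda doc_id: (hamming_distance(query_code, codes[doc_id]), doc_id),
    )
    return ranked[:k]


# Hypothetical 8-bit codes; a real system would learn these from the documents.
codes = {
    "doc_a": 0b10110010,
    "doc_b": 0b10110011,
    "doc_c": 0b01001101,
    "doc_d": 0b10100010,
}

query = 0b10110000
nearest = search(query, codes, k=2)
print(nearest)
```

Each comparison is one XOR followed by a bit count, and each document is stored as a single small integer, which is where the fast search speed and low storage cost of binary codes come from.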
Related Papers
- SES-LSH: Shuffle-Efficient Locality Sensitive Hashing for Distributed Similarity Search (2017), cited by 16
- LSH vs Randomized Partition Trees: Which One to Use for Nearest Neighbor Search? (2014), cited by 13
- Large-Scale Distributed Locality-Sensitive Hashing for General Metric Data (2014), cited by 9
- Theoretical analysis on pruning nearest neighbor candidates by locality sensitive hashing (2010)