A robust audio fingerprinting method for content-based copy detection
Citations over time: top 10% of 2014 papers
Abstract
This paper presents a novel audio fingerprinting method that is highly robust to a variety of audio distortions. It is based on an unconventional fingerprint generation scheme. Robustness is achieved by generating several versions of the spectrogram matrix of the audio signal, each pruned with a threshold based on the average of the spectral values. We transform each pruned version of the spectrogram matrix into a 2-D binary image. The resulting images suppress noise to varying degrees, which improves the likelihood that one of them matches a reference image. To speed up matching, we convert each image into an n-dimensional vector and perform a nearest neighbor search on these vectors. We test the method on the TRECVID 2010 content-based copy detection evaluation dataset. Experimental results show that such fingerprints remain effective even when the audio is distorted. We also compare the proposed method to a state-of-the-art audio copy detection system. The comparison shows that our method improves localization accuracy by 22% and halves the minimal normalized detection cost rate (min NDCR) for audio transformations T1 and T2.
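The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract only says that thresholds are based on the average spectral value, that each binary image becomes an n-dimensional vector, and that matching uses nearest neighbor search. Everything else here is an assumption made for illustration: the function names, the frame and hop sizes, the threshold multipliers in `scales`, the grid-density vectorization, and the brute-force Euclidean search.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop are illustrative values, not taken from the paper.
    Returns an array of shape (frequency_bins, frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def binary_images(spec, scales=(0.5, 1.0, 2.0)):
    """Prune the spectrogram at several thresholds, one binary image each.

    Each threshold is a multiple of the mean spectral value; the multipliers
    are assumed. Higher thresholds keep fewer cells, so they suppress more noise.
    """
    mean = spec.mean()
    return [(spec > s * mean).astype(np.uint8) for s in scales]


def image_to_vector(image, grid=(4, 4)):
    """Summarize a binary image as an n-dimensional vector (n = grid rows * cols).

    Assumed scheme: split the image into a grid and record the fraction of
    set cells in each block.
    """
    bands = np.array_split(image, grid[0], axis=0)
    return np.array(
        [block.mean() for band in bands for block in np.array_split(band, grid[1], axis=1)]
    )


def nearest_reference(query_vec, reference_vecs):
    """Brute-force nearest neighbor: index and Euclidean distance of the best match."""
    dists = np.linalg.norm(reference_vecs - query_vec, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])
```

A query clip would yield one vector per threshold, and each vector would be searched against the reference vectors, keeping the closest match overall. A real system would likely replace the brute-force search with an indexed one for large reference collections.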