Speech Source Separation Using Variational Autoencoder and Bandpass Filter
Abstract
Speech source separation is essential for speech-related applications because it enhances the input speech signal before it reaches the main processing model. Most current approaches focus on separating speech from common high-frequency noise or from one particular background sound. They cannot remove signals that overlap with human speech in its frequency range. To deal with this problem, we propose a hybrid approach combining a variational autoencoder (VAE) and a bandpass filter (BPF). This method can extract and enhance the speech signal from a mixture that contains speech, high-frequency noise, and many kinds of background sounds that interfere with the speech. Experimental results show that our model effectively extracts the speech signal, achieving a signal-to-interference ratio (SIR) of 15.02 dB and a signal-to-distortion ratio (SDR) of 12.99 dB. In addition, the passband can be adjusted to set the frequency range of the output signal for a particular application, such as gender classification.
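The adjustable passband mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the BPF is designed, so the code below is only an assumption: a generic second-order (biquad) bandpass filter in the widely used audio-EQ cookbook form, written in plain Python. The function names, the 16 kHz sample rate, and the 300 to 3400 Hz passband are all illustrative choices, not values from the paper, and the VAE stage is not modeled at all.

```python
import math

def bandpass_coeffs(low_hz, high_hz, fs):
    """Biquad bandpass coefficients (hypothetical helper, not from the paper).

    The centre frequency is the geometric mean of the band edges and
    Q = centre / bandwidth, which only approximates the requested edges.
    """
    if not 0 < low_hz < high_hz < fs / 2:
        raise ValueError("need 0 < low_hz < high_hz < fs / 2")
    f0 = math.sqrt(low_hz * high_hz)
    q = f0 / (high_hz - low_hz)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    # Normalise so that the leading feedback coefficient is 1.
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a

def apply_filter(b, a, samples):
    """Run the difference equation y[n] = sum(b*x) - sum(a*y) over samples."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in samples:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out

def tone(freq_hz, fs, n):
    """Unit-amplitude sine wave used as a test signal."""
    return [math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]

def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def band_gain(freq_hz, low_hz, high_hz, fs=16000, n=16000):
    """Ratio of output RMS to input RMS for a single test tone."""
    b, a = bandpass_coeffs(low_hz, high_hz, fs)
    x = tone(freq_hz, fs, n)
    return rms(apply_filter(b, a, x)) / rms(x)

if __name__ == "__main__":
    # Illustrative passband only; changing the edges moves the retained band.
    for f in (50, 1000, 7000):
        print(f, "Hz gain:", round(band_gain(f, 300, 3400), 3))
```

Changing the two band edges passed to `bandpass_coeffs` is the only step needed to retarget the filter to a different frequency range. A single biquad has gentle roll-off, so a practical system would likely cascade several sections or use a higher-order design.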