Hybrid HMM-BLSTM-Based Acoustic Modeling for Automatic Speech Recognition on Quran Recitation
Abstract
Many software applications now help people access the Quran on their own devices. Some of these applications also include a feature that recognizes the user's Quran recitation, which makes the recognition capability of such applications worth studying. Compared with English and other spoken languages, Automatic Speech Recognition (ASR) for Quran recitation is a relatively new research area. In some studies, the Hidden Markov Model (HMM)-Gaussian Mixture Model (GMM) combination remains a popular choice for acoustic modeling. However, HMM-GMM generalizes poorly to high-variance data and has difficulty with data that are not linearly separable. To address these problems, this paper proposes a deep learning method for training the acoustic model for Quran speech recognition. Bidirectional Long Short-Term Memory (BLSTM), a deep learning topology, was used in the experiments and combined with an HMM to form a hybrid system. In prior research, this method has worked well for other languages, e.g. English speech recognition. Overall, the results show that the method also performs well on Quran speech recognition compared with our HMM-GMM baseline system. The baseline models achieved an average word error rate (WER) of 18.39%, whereas our experimental model (an acoustic model with hybrid HMM-BLSTM) achieved a far better average WER of 4.63% under the same testing scenario. The effect of Quran recitation style (Maqam) was also analyzed by building models that depend on the recitation style.
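The abstract reports results as WER but does not spell out how it is computed. As a rough illustration only, the sketch below assumes the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function names `word_error_rate` and `relative_wer_reduction` are hypothetical and do not come from the paper, and the paper's actual scoring tool may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumes whitespace tokenization, which is a simplification.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline WER removed by the improved system."""
    return (baseline_wer - improved_wer) / baseline_wer


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
    # Average WER figures reported in the abstract (in percent).
    print(relative_wer_reduction(18.39, 4.63))
```

Under this definition, the reported drop from 18.39% to 4.63% average WER corresponds to a relative reduction of roughly 75%.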