Improving Mispronunciation Detection of Mandarin for Tibetan Students Based on the End-To-End Speech Recognition Model
Abstract
Mispronunciation detection is an important part of Computer-Assisted Pronunciation Training (CAPT) systems. In this paper, we build a CNN-GRU-CTC acoustic model that combines a convolutional neural network (CNN), Gated Recurrent Units (GRU), and Connectionist Temporal Classification (CTC) to study pronunciation error detection for Tibetan students speaking Mandarin. The approach is end-to-end: it requires neither phonemic or graphemic information nor forced alignment between different linguistic units. The experimental results show that the proposed method can effectively detect mispronunciations. The false rejection rate is 7.26%, the detection accuracy rate is 88.35%, and the combined error rate is 14.91%.
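To illustrate why a CTC-trained model needs no forced alignment, the following is a minimal sketch of standard CTC best-path (greedy) decoding. It is not taken from the paper: the function name `ctc_greedy_decode`, the blank index 0, and the toy per-frame scores are all assumptions made for illustration. The idea is that the network emits one label distribution per frame, and the label sequence is recovered by merging repeated labels and removing the blank symbol.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index for every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Merge consecutive identical labels into one.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove the blank symbol; what remains is the decoded label sequence.
    return [label for label in collapsed if label != blank]


# Toy scores for 6 frames over 4 labels (index 0 is the blank).
toy_scores = [
    [0.10, 0.80, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.10],
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.60, 0.20, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.80, 0.10, 0.05, 0.05],
]
decoded = ctc_greedy_decode(toy_scores)
print(decoded)
```

In this toy input the blank between the second and fourth frames keeps the two occurrences of label 1 separate, which is how CTC represents repeated units without frame-level alignment.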