Towards acoustic model unification across dialects
Top 10% of 2016 papers
Abstract
Acoustic model performance typically degrades when a model is evaluated on a dialectal variation of the language it was not trained on. Similarly, models trained jointly on a group of dialects tend to underperform dialect-specific models. In this paper, we report on our efforts towards building a unified acoustic model that can serve a multi-dialectal language. Two techniques are presented: Distillation and Multitask Learning (MTL). In Distillation, we use an ensemble of dialect-specific acoustic models and distill its knowledge into a single model. In MTL, we train a unified acoustic model that learns to distinguish dialects as a side task. We show that both techniques are superior to the jointly-trained model trained on all dialectal data, reducing word error rates by 4.2% and 0.6%, respectively. While achieving this improvement, neither technique degrades the performance of the dialect-specific models by more than 3.4%.
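The distillation recipe summarized in the abstract, averaging the softened posteriors of the dialect-specific teacher models and training the unified student against them, can be sketched as below. This is a minimal illustration, not the paper's implementation: the temperature value, the uniform averaging over teachers, and the cross-entropy form of the loss are all assumptions for the sake of the example.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; a higher temperature yields softer targets.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_soft_targets(dialect_logits, temperature=2.0):
    # Average the softened posteriors of the dialect-specific teachers
    # to form a single soft target distribution for the student.
    probs = [softmax(logits, temperature) for logits in dialect_logits]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]

def distillation_loss(student_logits, soft_targets, temperature=2.0):
    # Cross-entropy between the ensemble's soft targets and the
    # student's (equally softened) output distribution.
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(soft_targets, student))

# Hypothetical per-frame senone logits from two dialect-specific teachers.
teacher_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
targets = ensemble_soft_targets(teacher_logits)
loss = distillation_loss([1.8, 0.7, -0.8], targets)
```

In practice this frame-level loss would be minimized over the pooled multi-dialect training data, optionally interpolated with the usual hard-label cross-entropy.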