Large vocabulary automatic speech recognition for children

2015, pp. 1611–1615

Hank Liao, Golan Pundak, Olivier Siohan, Melissa K. Carroll, Noah Coccaro, Qi-Ming Jiang, Tara N. Sainath, Andrew Senior, Françoise Beaufays, Michiel Bacchiani

Abstract

Recently, Google launched YouTube Kids, a mobile application for children that uses a speech recognizer built specifically for recognizing children’s speech. In this paper we present the techniques we explored to build such a system. We describe the use of a neural network classifier to identify matched acoustic training data, and the filtering of language modeling data to reduce the chance of producing offensive results. We also compare long short-term memory (LSTM) recurrent networks with convolutional, LSTM, deep neural networks (CLDNNs). We found that a CLDNN acoustic model outperforms an LSTM across a variety of conditions, but that its relative gain on child speech is no greater than on adult speech. Overall, these findings allow us to build a successful, state-of-the-art large vocabulary speech recognizer for both children and adults.
