Acoustic models of the elderly for large‐vocabulary continuous speech recognition



Electronics and Communications in Japan (Part II: Electronics), 2004, Vol. 87(7), pp. 49–57


Akira Baba, Shinichi Yoshizawa, Miichi Yamada, Akinobu Lee, Kiyohiro Shikano

Abstract

Large‐vocabulary continuous speech recognition systems have recently come into widespread use, encouraging the application of speech recognition techniques to a variety of problems. One factor that degrades the performance of such systems is a mismatch between the acoustic properties of the user's speech and the acoustic model. Acoustic models are generally trained on the speech of young or middle‐aged adults, so they match the acoustic properties of elderly speech poorly, which may lower the recognition rate. In this study, a large‐scale elderly speech database (200 sentences × 301 subjects) is used to train an acoustic model, and the resulting elderly acoustic model is evaluated with a large‐vocabulary continuous speech recognition system. In the experiments, the word recognition rate improved by 3 to 5% compared with an acoustic model trained on young and middle‐aged adult speech, namely the JNAS speech database (150 sentences × 260 subjects, average age 28.6 years). The experiments also confirm that speaker adaptation to elderly speech improves the recognition rate further when it starts from an acoustic model trained on elderly speech. © 2004 Wiley Periodicals, Inc. Electron Comm Jpn Pt 2, 87(7): 49–57, 2004; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjb.20101
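The abstract reports results as a word recognition rate but does not define it. The following is a minimal sketch of one common way to score recognizer output, assuming the rate is word accuracy, (N − S − D − I) / N, computed from a minimum edit-distance alignment between reference and hypothesis word sequences. The function names `word_errors` and `word_accuracy` are hypothetical and are not taken from the paper, whose exact scoring procedure may differ.

```python
def word_errors(ref, hyp):
    """Minimum number of substitutions, deletions and insertions
    needed to turn the reference word list into the hypothesis."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(pairs):
    """Percent word accuracy over (reference, hypothesis) word-list pairs.

    Assumed definition: 100 * (N - S - D - I) / N, where N is the total
    number of reference words across all pairs.
    """
    total_words = sum(len(ref) for ref, _ in pairs)
    if total_words == 0:
        raise ValueError("no reference words to score")
    total_errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    return 100.0 * (total_words - total_errors) / total_words
```

Under this assumed definition, scoring the same test utterances once with a model trained on younger speakers and once with a model trained on elderly speakers, then subtracting the two values, gives the kind of difference the abstract reports.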


