Recent innovations in speech-to-text transcription at SRI-ICSI-UW
Top 1% of 2006 papers
Abstract
We summarize recent progress in automatic speech-to-text transcription at SRI, ICSI, and the University of Washington. The work encompasses all components of speech modeling found in a state-of-the-art recognition system, from acoustic features, to acoustic modeling and adaptation, to language modeling. In the front end, we experimented with nonstandard features, including various measures of voicing, discriminative phone posterior features estimated by multilayer perceptrons, and a novel phone-level macro-averaging for cepstral normalization. Acoustic modeling was improved with combinations of front ends operating at multiple frame rates, as well as by modifications to the standard methods for discriminative Gaussian estimation. We show that acoustic adaptation can be improved by predicting the optimal regression class complexity for a given speaker. Language modeling innovations include the use of a syntax-motivated almost-parsing language model, as well as principled vocabulary-selection techniques. Finally, we address portability issues, such as the use of imperfect training transcripts, and language-specific adjustments required for recognition of Arabic and Mandarin.
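The abstract mentions a phone-level macro-averaging variant of cepstral normalization. The paper's exact formulation is not given here, but a minimal sketch can illustrate the general idea: standard cepstral mean normalization (CMN) subtracts a frame-weighted mean, which frequent phones dominate, whereas a macro-averaged variant (as assumed below from the abstract's description) first computes per-phone means and then averages those, so every phone contributes equally. Function names and the per-utterance scope are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: frame-averaged CMN vs. phone-level macro-averaged CMN.
# Assumption: frames are lists of cepstral coefficients, and each frame
# carries a phone label (e.g. from a forced alignment).
from collections import defaultdict


def cmn_frame_mean(frames):
    """Standard CMN: subtract the per-coefficient mean over all frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def cmn_phone_macro(frames, phone_labels):
    """Macro-averaged CMN: subtract the mean of per-phone means,
    giving each phone equal weight regardless of its frame count."""
    dim = len(frames[0])
    by_phone = defaultdict(list)
    for frame, phone in zip(frames, phone_labels):
        by_phone[phone].append(frame)
    # One mean vector per phone...
    phone_means = [
        [sum(f[d] for f in fs) / len(fs) for d in range(dim)]
        for fs in by_phone.values()
    ]
    # ...then an unweighted (macro) average over phones.
    macro = [sum(m[d] for m in phone_means) / len(phone_means)
             for d in range(dim)]
    return [[f[d] - macro[d] for d in range(dim)] for f in frames]
```

With three frames of a frequent phone at 1.0 and one frame of a rare phone at 5.0, the frame-level mean is 2.0 while the macro mean is 3.0, so the rare phone shifts the normalization point much more under macro-averaging.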
Related Papers
- An Improved Mandarin Keyword Spotting System Using MCE Training and Context-Enhanced Verification (2006), 8 citations
- Deep Learning Based Language Modeling for Domain-Specific Speech Recognition (2017), 1 citation
- Large Scale Language Modeling in Automatic Speech Recognition (2012), 37 citations
- Keyword Spotting using Keyword Adapted Language Model (2007)
- Investigation of Subwords Confidence Performance in Chinese Speech Verification (2008)