Training set issues in SRI's DECIPHER speech recognition system
Abstract
SRI has developed the DECIPHER system, a hidden Markov model (HMM) based continuous speech recognition system typically used in a speaker-independent manner. We first review the DECIPHER system, then show that its speaker-independent performance improved by 20% when the standard 3990-sentence speaker-independent training set was augmented with training data from the 7200-sentence Resource Management speaker-dependent training set. We show a further improvement of over 20% when a version of corrective training was implemented. Finally, we show improvement from using parallel male- and female-trained models in DECIPHER. With all three improvements combined, the word-error rate was 3.7% on DARPA's February 1989 speaker-independent test set using the standard perplexity-60 word-pair grammar.
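For readers unfamiliar with the word-error rate quoted above, the following is a minimal sketch of one common way to compute it: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical, and the abstract does not state the exact scoring conventions used in the DARPA evaluation, so this should not be read as the authors' scoring procedure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: substitutions, deletions, and insertions each cost 1.
    The scoring rules of the original evaluation may differ.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("ships" -> "ship") and one
# insertion ("the") against a five-word reference.
print(word_error_rate("show all ships in port", "show all ship in the port"))
```

Under these assumptions, a rate of 3.7% corresponds to roughly 37 word errors per 1000 reference words.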