Exploring recognition network representations for efficient speech inference on highly parallel platforms

2010, pp. 1489–1492

Jike Chong, Ekaterina Gonina, Kisun You, Kurt Keutzer

Abstract

The emergence of highly parallel computing platforms is enabling new trade-offs in algorithm design for automatic speech recognition. This naturally raises a question: do the most computationally efficient sequential algorithms lead to the most computationally efficient parallel algorithms? In this paper we explore two contending recognition network representations for speech inference engines: the linear lexical model (LLM) and the weighted finite state transducer (WFST). We demonstrate that while an inference engine using the simpler LLM representation evaluates 22× more transitions than one using the advanced WFST representation, the simple structure of the LLM representation allows 4.7–6.4× faster evaluation and 53–65× faster operand gathering for each state transition. We experiment with the 5k Wall Street Journal corpus on the NVIDIA GTX480 (Fermi) and NVIDIA GTX285 Graphics Processing Units (GPUs), and illustrate that the performance of a speech inference engine based on the LLM representation is competitive with one based on the WFST representation on highly parallel computing platforms.
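
The trade-off in the abstract can be made concrete with a little arithmetic. The sketch below is not from the paper: it assumes a simple model in which the time spent in a phase equals the number of transitions times the per-transition cost, and it reads the 22× figure as a count of transitions evaluated. The function and constant names are hypothetical, chosen for illustration only.

```python
# Figures quoted in the abstract (LLM relative to WFST).
TRANSITION_RATIO = 22.0          # LLM evaluates 22x more transitions
EVAL_SPEEDUP = (4.7, 6.4)        # per-transition evaluation speedup range
GATHER_SPEEDUP = (53.0, 65.0)    # per-transition operand-gathering speedup range


def relative_phase_time(transition_ratio, per_transition_speedup):
    """LLM phase time divided by WFST phase time.

    Assumed model (not stated in the abstract):
    phase time = transition count * cost per transition.
    """
    return transition_ratio / per_transition_speedup


def phase_range(transition_ratio, speedups):
    """Return (best, worst) relative phase time for a speedup range."""
    low, high = speedups
    # A larger speedup gives a smaller relative time, so 'high' is the best case.
    return (
        relative_phase_time(transition_ratio, high),
        relative_phase_time(transition_ratio, low),
    )


eval_range = phase_range(TRANSITION_RATIO, EVAL_SPEEDUP)
gather_range = phase_range(TRANSITION_RATIO, GATHER_SPEEDUP)

print(f"evaluation phase, LLM/WFST time: {eval_range[0]:.2f} to {eval_range[1]:.2f}")
print(f"gathering phase,  LLM/WFST time: {gather_range[0]:.2f} to {gather_range[1]:.2f}")
```

Under this assumed model, the LLM evaluation phase takes roughly 3.4 to 4.7 times as long as the WFST one, while the LLM operand-gathering phase takes only about 0.34 to 0.42 times as long. The abstract does not say how total run time divides between the two phases, so these ratios alone do not determine the overall comparison; they only show how a 22× larger workload can be offset where per-transition costs fall sharply.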
