
Wav2Letter++: A Fast Open-source Speech Recognition System

2019, pp. 6460–6464


Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

Abstract

This paper introduces wav2letter++, a fast open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for maximum efficiency. We explain the architecture and design of the wav2letter++ system and compare it to other major open-source speech recognition systems. In some cases wav2letter++ is more than 2× faster than other optimized frameworks for training end-to-end neural networks for speech recognition. We also show that wav2letter++ training times scale linearly to 64 GPUs, the most we tested, for models with 100 million parameters. High-performance frameworks enable fast iteration, which is often a crucial factor in successful research and model tuning on new datasets and tasks.
