Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models
Citations Over Time: Top 12% of 2019 papers
Abstract
A method (500) includes receiving audio data encoding an utterance (106) spoken by a native speaker (110) of a first language, and receiving a biasing term list (105) including one or more terms in a second language different than the first language. The method also includes processing, using a speech recognition model (200), acoustic features (104) derived from the audio data to generate speech recognition scores for both wordpieces and corresponding phoneme sequences in the first language. The method also includes rescoring the speech recognition scores for the phoneme sequences based on the one or more terms in the biasing term list, and executing, using the speech recognition scores for the wordpieces and the rescored speech recognition scores for the phoneme sequences, a decoding graph (400) to generate a transcription (116) for the utterance.
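The rescoring step described in the abstract can be illustrated with a toy sketch. This is not the method's actual implementation: it swaps the decoding graph for plain contiguous-sequence matching over a list of candidate hypotheses, and it covers only the phoneme rescoring, not the joint wordpiece and phoneme decoding. The function names, the per-phoneme bonus, the example phoneme symbols, and the scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Phonemes = Tuple[str, ...]
Hypothesis = Tuple[Phonemes, float]  # (phoneme sequence, log score)


def count_matched_phonemes(phonemes: Sequence[str],
                           bias_seqs: Sequence[Sequence[str]]) -> int:
    """Count phonemes covered by full, non-overlapping matches of bias sequences.

    Scans left to right and, at each position, takes the longest bias
    sequence that matches completely. Partial matches earn nothing here,
    which is a simplification of graph-based biasing.
    """
    matched = 0
    i = 0
    while i < len(phonemes):
        hit = 0
        for seq in bias_seqs:
            n = len(seq)
            if n > hit and tuple(phonemes[i:i + n]) == tuple(seq):
                hit = n
        if hit:
            matched += hit
            i += hit
        else:
            i += 1
    return matched


def rescore(hyps: List[Hypothesis],
            bias_seqs: Sequence[Sequence[str]],
            bonus_per_phoneme: float = 1.0) -> List[Hypothesis]:
    """Add a bonus to each hypothesis per matched bias phoneme, best first."""
    rescored = [
        (ph, score + bonus_per_phoneme * count_matched_phonemes(ph, bias_seqs))
        for ph, score in hyps
    ]
    return sorted(rescored, key=lambda h: h[1], reverse=True)


# Hypothetical biasing term from the second language, written with
# first-language phoneme symbols (illustrative values only).
bias_seqs = [("k", "r", "E", "t", "E", "j")]

# Hypothetical candidate phoneme sequences with made-up log scores.
hyps = [
    (("k", "r", "i", "t", "l"), -2.0),
    (("k", "r", "E", "t", "E", "j"), -3.0),
]

best_phonemes, best_score = rescore(hyps, bias_seqs, bonus_per_phoneme=0.5)[0]
print(best_phonemes, best_score)
```

In this sketch the hypothesis that spells out the biasing term's phoneme sequence overtakes the initially higher-scoring one once the bonus is applied, which is the effect the rescoring step is meant to have on the final transcription.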