Simultaneous Speech-to-Speech Translation System with Transformer-Based Incremental ASR, MT, and TTS

2021, Vol. 9, pp. 186–192

Ryo Fukuda, Sashi Novitasari, Yui Oka, Yasumasa Kano, Yuki Yano, Yuka Ko, Hirotaka Tokuyama, Kosuke Doi, Tomoya Yanagita, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura

Abstract

In this paper, we present an English-to-Japanese simultaneous speech-to-speech translation (S2ST) system. It consists of three Transformer-based incremental processing modules: automatic speech recognition (ASR), machine translation (MT), and text-to-speech synthesis (TTS). We also evaluate its system-level latency in addition to the module-level latency and accuracy.
