T-GSA: Transformer with Gaussian-Weighted Self-Attention for Speech Enhancement
Abstract
Transformer neural networks (TNNs) have demonstrated state-of-the-art performance on many natural language processing (NLP) tasks, replacing recurrent neural networks (RNNs) such as LSTMs and GRUs. However, TNNs have not performed well in speech enhancement, whose contextual nature differs from that of NLP tasks such as machine translation. Self-attention is the core building block of the Transformer: it not only enables parallelization of sequence computation, but also provides a constant path length between symbols, which is essential for learning long-range dependencies. In this paper, we propose a Transformer with Gaussian-weighted self-attention (T-GSA), whose attention weights are attenuated according to the distance between target and context symbols. The experimental results show that the proposed T-GSA significantly improves speech-enhancement performance compared to the standard Transformer and RNNs.
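To make the distance-based attenuation concrete, below is a minimal single-head sketch in PyTorch of how a Gaussian window over the frame distance |i − j| can down-weight attention scores. The class name, the learnable `log_sigma` parameter, the initial width, and the multiplicative masking followed by renormalization are illustrative assumptions for exposition, not the authors' reference implementation.

```python
import math
import torch
import torch.nn as nn


class GaussianWeightedSelfAttention(nn.Module):
    """Illustrative single-head self-attention whose weights are attenuated
    by a Gaussian of the distance between target and context frames.
    (Hypothetical sketch; not the paper's reference code.)"""

    def __init__(self, d_model: int, init_sigma: float = 10.0):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Learnable width of the Gaussian window (assumed one per layer here).
        self.log_sigma = nn.Parameter(torch.tensor(math.log(init_sigma)))
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model) sequence of spectral frames
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale  # (B, T, T)

        # Gaussian weighting: context frames far from the target frame get
        # exponentially smaller weight, so attention stays mostly local.
        t = x.size(1)
        pos = torch.arange(t, device=x.device, dtype=x.dtype)
        sq_dist = (pos.unsqueeze(0) - pos.unsqueeze(1)) ** 2        # (T, T)
        sigma = self.log_sigma.exp()
        gauss = torch.exp(-sq_dist / (2.0 * sigma ** 2))            # values in (0, 1]

        # One simple variant: scale the softmax attention by the Gaussian
        # mask and renormalize so rows still sum to one.
        attn = torch.softmax(scores, dim=-1) * gauss
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return torch.matmul(attn, v)
```

As a usage note, such a layer would drop into a Transformer encoder block in place of standard scaled dot-product attention; whether the Gaussian is applied before or after the softmax, and whether its width is fixed or learned, are design choices the sketch does not settle.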
Related Papers
- Training Deeper Neural Machine Translation Models with Transparent Attention (2018), 144 citations
- Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder (2019), 65 citations
- Neural Machine Translation with the Transformer and Multi-Source Romance Languages for the Biomedical WMT 2018 task (2018), 15 citations
- Searching Better Architectures for Neural Machine Translation (2020), 27 citations
- Incorporating Pre-trained Model into Neural Machine Translation (2021), 2 citations