Age Estimation in Short Speech Utterances Based on LSTM Recurrent Neural Networks
Abstract
Age estimation from speech has recently received increased interest, as it is useful for many applications such as user profiling, targeted marketing, or personalized call routing. Applications of this kind need to estimate the age of the speaker quickly and can benefit greatly from real-time capabilities. Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to outperform state-of-the-art approaches in related speech-based tasks, such as language identification or voice activity detection, especially when an accurate real-time response is required. In this paper, we propose a novel age estimation system based on LSTM-RNNs. The system can handle short utterances (from 3 to 10 s) and can be easily deployed in a real-time architecture. The proposed system has been tested and compared with a state-of-the-art i-vector approach using data from the NIST Speaker Recognition Evaluation 2008 and 2010 data sets. Experiments on short-duration utterances show a relative improvement of up to 28% in mean absolute error for the new approach over the baseline system.
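To make the reported metric concrete, the following is a minimal sketch of how mean absolute error (MAE) and a relative improvement over a baseline can be computed. The function names and all age values are hypothetical and chosen only for illustration; they are not taken from the paper or its data sets.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages (in years)."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def relative_improvement(baseline_mae, system_mae):
    """Fraction by which system_mae is lower than baseline_mae."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - system_mae) / baseline_mae


# Hypothetical ages and predictions, for illustration only.
true_ages = [25, 40, 63, 31]
baseline_pred = [35, 30, 53, 41]  # stand-in for a baseline system
system_pred = [33, 33, 56, 38]    # stand-in for a new system

baseline_mae = mean_absolute_error(true_ages, baseline_pred)
system_mae = mean_absolute_error(true_ages, system_pred)
print(f"baseline MAE: {baseline_mae:.2f} years")
print(f"system MAE:   {system_mae:.2f} years")
print(f"relative improvement: {relative_improvement(baseline_mae, system_mae):.1%}")
```

A relative improvement expresses the error reduction as a fraction of the baseline error, so it is comparable across test conditions whose absolute errors differ.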