Speech emotion recognition using deep neural network and extreme learning machine
Top 1% of 2014 papers
Abstract
Speech emotion recognition is a challenging problem, partly because it is unclear which features are effective for the task. In this paper, we propose to utilize deep neural networks (DNNs) to extract high-level features from raw data and show that they are effective for speech emotion recognition. We first produce an emotion state probability distribution for each speech segment using DNNs. We then construct utterance-level features from the segment-level probability distributions. These utterance-level features are then fed into an extreme learning machine (ELM), a simple and efficient single-hidden-layer neural network, to identify utterance-level emotions. The experimental results demonstrate that the proposed approach effectively learns emotional information from low-level features and leads to a 20% relative accuracy improvement compared to state-of-the-art approaches.
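The last two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the segment-level DNN has already produced one probability vector per segment, and the choice of pooling statistics (maximum, minimum, mean, and the fraction of segments above a threshold), the threshold of 0.2, the hidden-layer size, and the ridge term are all illustrative assumptions. The names `utterance_features` and `ELM` are hypothetical.

```python
import numpy as np


def utterance_features(segment_probs, threshold=0.2):
    """Pool (n_segments, n_emotions) probabilities into one fixed-length vector.

    The four statistics and the threshold are assumptions made for this sketch.
    """
    p = np.asarray(segment_probs, dtype=float)
    return np.concatenate([
        p.max(axis=0),                 # strongest evidence for each emotion
        p.min(axis=0),                 # weakest evidence for each emotion
        p.mean(axis=0),                # average evidence for each emotion
        (p > threshold).mean(axis=0),  # fraction of segments above the threshold
    ])


class ELM:
    """Single-hidden-layer network with random, fixed input weights.

    Only the output weights are learned, in closed form, by ridge-regularized
    least squares on one-hot targets.
    """

    def __init__(self, n_hidden=64, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=int)
        n_classes = int(y.max()) + 1
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)  # closed-form output weights
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return np.argmax(scores, axis=1)
```

In use, each utterance's segment probabilities would be pooled with `utterance_features`, the resulting vectors stacked into a matrix, and that matrix passed to `ELM().fit(X, y)` together with integer emotion labels. Because training reduces to a single linear solve, fitting is fast, which fits the abstract's description of the ELM as simple and efficient.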