Learning to disentangle emotion factors for facial expression recognition in the wild
Abstract
Facial expression recognition (FER) in the wild is a challenging problem because expressions appear under complex conditions (e.g., large head poses, illumination variation, and occlusions), which leads to suboptimal FER performance. FER accuracy relies heavily on discovering highly discriminative, emotion-related features. In this paper, we propose an end-to-end module that disentangles latent emotion-discriminative factors from the other complex factors of variation in order to obtain salient emotion features. Training of the proposed method consists of two stages. First, emotion samples are used to obtain a latent representation with a variational auto-encoder trained with a reconstruction penalty. Second, the latent representation is fed into a disentangling layer that learns a set of discriminative emotion factors through an attention mechanism (e.g., a Squeeze-and-Excitation block), which encourages the separation of emotion-related factors from non-affective factors. Experimental results on the public benchmark databases RAF-DB and FER2013 show that our approach outperforms current state-of-the-art methods in complex scenes.
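The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear encoder and decoder, the random weights, the dimensions, the reduction ratio, and the function names (`reparameterize`, `vae_loss`, `se_gate`) are all assumptions chosen for illustration, and the Squeeze-and-Excitation idea is applied here directly to a latent vector rather than to a convolutional feature map.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps (standard VAE reparameterization)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def vae_loss(x, x_recon, mu, log_var):
    """Reconstruction penalty (squared error) plus KL divergence to N(0, I)."""
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    return recon + kl


def se_gate(z, w1, w2):
    """SE-style excitation on a latent vector: FC -> ReLU -> FC -> sigmoid, then rescale."""
    hidden = np.maximum(z @ w1, 0.0)
    weights = sigmoid(hidden @ w2)  # one attention weight in (0, 1) per latent factor
    return z * weights, weights


# Hypothetical toy sizes; the abstract does not give the real ones.
batch, input_dim, latent_dim, reduction = 4, 16, 8, 4
x = rng.standard_normal((batch, input_dim))

# Stage 1: latent representation from a (here purely linear) variational auto-encoder.
w_mu = rng.standard_normal((input_dim, latent_dim)) * 0.1
w_log_var = rng.standard_normal((input_dim, latent_dim)) * 0.1
w_dec = rng.standard_normal((latent_dim, input_dim)) * 0.1
mu, log_var = x @ w_mu, x @ w_log_var
z = reparameterize(mu, log_var, rng)
loss = vae_loss(x, z @ w_dec, mu, log_var)

# Stage 2: the disentangling layer re-weights latent factors with attention.
w1 = rng.standard_normal((latent_dim, latent_dim // reduction)) * 0.1
w2 = rng.standard_normal((latent_dim // reduction, latent_dim)) * 0.1
emotion_factors, attention = se_gate(z, w1, w2)
print(emotion_factors.shape, attention.shape)
```

In a trained model, factors with attention weights near 1 would be the ones treated as emotion-related, while factors pushed toward 0 would be suppressed as non-affective. With the random weights used here, the attention values carry no such meaning.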