Audio-Visual Affect Recognition through Multi-Stream Fused HMM for HCI
Top 10% of 2005 papers
Abstract
Advances in computer processing power and emerging algorithms are enabling new ways of envisioning human-computer interaction. This paper focuses on the development of a computing algorithm that uses audio and visual sensors to detect and track a user's affective state to aid computer decision making. Using our multi-stream fused hidden Markov model (MFHMM), we analyzed coupled audio and visual streams to detect 11 cognitive/emotive states. The MFHMM builds an optimal connection among multiple streams according to the maximum entropy principle and the maximum mutual information criterion. Person-independent experimental results from 20 subjects in 660 sequences show that the MFHMM approach achieves an accuracy of 80.61%, outperforming the face-only HMM, pitch-only HMM, energy-only HMM, and independent HMM fusion.
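The independent HMM fusion baseline that the abstract compares against can be sketched as a late fusion of per-stream HMM log-likelihoods: one HMM is trained per modality (e.g., face, pitch, energy) per state, and their scores are summed at decision time. The sketch below is illustrative only; the stream names, weights, and scores are assumptions, not values from the paper, and the MFHMM itself couples the streams more tightly than this baseline does.

```python
def fuse_streams(stream_loglik, weights=None):
    """Late-fuse per-stream HMM log-likelihoods by weighted summation.

    stream_loglik: dict mapping stream name -> {state: log-likelihood}
                   (scores each single-modality HMM assigns to one sequence).
    weights: optional per-stream weights; defaults to 1.0 for every stream.
    Returns the affective state with the highest combined score.
    """
    weights = weights or {s: 1.0 for s in stream_loglik}
    states = next(iter(stream_loglik.values())).keys()
    combined = {
        state: sum(weights[s] * ll[state] for s, ll in stream_loglik.items())
        for state in states
    }
    return max(combined, key=combined.get)

# Hypothetical scores: three single-modality HMMs each score two
# candidate states on one test sequence (numbers are made up).
scores = {
    "face":   {"interest": -110.2, "boredom": -123.9},
    "pitch":  {"interest": -98.7,  "boredom": -95.1},
    "energy": {"interest": -101.3, "boredom": -104.8},
}
print(fuse_streams(scores))  # -> interest (sum -310.2 vs -323.8)
```

Because the streams are scored independently, a modality that disagrees (here, pitch favoring "boredom") can be outvoted by the others, which is one motivation for the paper's coupled MFHMM formulation.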