Emergence of multimodal action representations from neural network self-organization
Citations over time: top 10% of 2016 papers
Abstract
The integration of multisensory information plays a crucial role in autonomous robotics, where it is needed to form robust and meaningful representations of the environment. In this work, we investigate how robust multimodal representations can naturally develop in a self-organizing manner from co-occurring multisensory inputs. We propose a hierarchical architecture with growing self-organizing neural networks for learning human actions from audiovisual inputs. The hierarchical processing of visual inputs yields progressively specialized neurons that encode latent spatiotemporal dynamics of the input, consistent with neurophysiological evidence for increasingly large temporal receptive windows in the human cortex. Associative links that bind unimodal representations are learned incrementally by a semi-supervised algorithm with bidirectional connectivity. Multimodal representations of actions are obtained from the co-activation of action features extracted from video sequences and labels provided by automatic speech recognition. Experimental results on a dataset of 10 full-body actions show that our system achieves state-of-the-art classification performance without requiring the manual segmentation of training samples, and that congruent visual representations can be retrieved from recognized speech in the absence of visual stimuli. Together, these results show that our hierarchical neural architecture accounts for the development of robust multimodal representations from dynamic audiovisual inputs.
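The idea of binding visual neurons to speech labels through co-activation, with lookup in both directions, can be illustrated with a minimal Python sketch. This is not the algorithm of the paper: the class `AssociativeLinks`, its method names, and the simple counting rule are assumptions chosen for illustration, and the growing self-organizing networks that would produce the neuron indices are left out entirely.

```python
from collections import defaultdict


class AssociativeLinks:
    """Hypothetical co-activation counter between visual neurons and speech labels.

    Assumption: each video segment has already been mapped to the index of a
    best-matching visual neuron, and each co-occurring utterance to a label.
    """

    def __init__(self):
        # counts[neuron_id][label] = number of times the two were co-active
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, neuron_id, label):
        """Strengthen the link between a visual neuron and a co-occurring label."""
        self.counts[neuron_id][label] += 1

    def label_for(self, neuron_id):
        """Vision to label: return the most frequently co-active label, or None."""
        labels = self.counts.get(neuron_id)
        if not labels:
            return None
        return max(labels, key=labels.get)

    def neurons_for(self, label):
        """Label to vision: return the neurons whose strongest link is this label."""
        return sorted(n for n in self.counts if self.label_for(n) == label)


links = AssociativeLinks()
for neuron_id, label in [(3, "walk"), (3, "walk"), (3, "jog"), (7, "jog")]:
    links.update(neuron_id, label)

print(links.label_for(3))       # classification from vision
print(links.neurons_for("jog"))  # retrieval of visual units from a recognized word
```

The second lookup mirrors the retrieval of visual representations from recognized speech when no visual stimulus is present; in this sketch it simply returns neuron indices rather than reconstructed visual sequences.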