Towards textually describing complex video contents with audio-visual concept classifiers
Abstract
Automatically generating compact textual descriptions of complex video contents has wide applications. Building on recent advances in automatic audio-visual content recognition, this paper explores the technical feasibility of precisely recounting video contents, which remains a challenging problem. Using cutting-edge automatic recognition techniques, we start by classifying a variety of visual and audio concepts in the video contents. Based on the classification results, we apply simple rule-based methods to generate textual descriptions of the videos. The results are evaluated through carefully designed user studies. We find that state-of-the-art visual and audio concept classification, although far from perfect, provides very useful clues about what is happening in the videos. Most users involved in the evaluation confirmed the informativeness of our machine-generated descriptions.
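The abstract does not spell out the rules used to turn classifier outputs into text, so the following is only a minimal sketch of what a rule-based step of this kind might look like. The concept names, the 0.5 confidence threshold, the top-3 cutoff, and the sentence templates are all illustrative assumptions, not the paper's actual method.

```python
from typing import Dict, List


def _join(items: List[str]) -> str:
    """Join concept names into a readable English list."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def describe(
    visual: Dict[str, float],
    audio: Dict[str, float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
    top_k: int = 3,          # assumed limit on concepts per modality
) -> str:
    """Turn per-concept classifier confidences into a short description.

    `visual` and `audio` map hypothetical concept names to confidence
    scores in [0, 1], as a concept classifier might produce.
    """

    def pick(scores: Dict[str, float]) -> List[str]:
        # Keep confident concepts, strongest first, capped at top_k.
        kept = [(c, s) for c, s in scores.items() if s >= threshold]
        kept.sort(key=lambda cs: cs[1], reverse=True)
        return [c for c, _ in kept[:top_k]]

    seen, heard = pick(visual), pick(audio)
    parts = []
    if seen:
        parts.append(f"The video shows {_join(seen)}.")
    if heard:
        parts.append(f"The audio contains {_join(heard)}.")
    if not parts:
        return "No concept was detected with sufficient confidence."
    return " ".join(parts)


example = describe(
    {"crowd": 0.9, "stage": 0.7, "dog": 0.2},
    {"music": 0.8, "speech": 0.4},
)
print(example)
```

Fixed templates like these keep the output easy to audit: every word in the description traces back to a concept whose score passed the threshold, which matters when the underlying classifiers are far from perfect.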