Implementation of H.264 decoder on general-purpose processors with media instructions
Top 10% of 2003 papers
Abstract
As emerging video coding standards such as H.264 aim to deliver high-quality video content at low bit rates, their encoding and decoding processes require much more computation than most existing standards. This paper analyzes the software implementation of a real-time H.264 decoder on general-purpose processors with media instructions. Specifically, we discuss how to optimize the speed of H.264 decoders on Intel Pentium 4 processors. We first analyze the reference implementation to identify the time-consuming modules. Our study shows that a number of components, e.g., motion compensation and the inverse integer transform, are the most time-consuming modules in the H.264 decoder. Second, we present a list of performance optimization methods that use media instructions to improve the efficiency of these modules. After appropriate optimizations, the decoder speed improved by more than 3x: it can decode a 720×480 video sequence at 48 frames per second on 2.4 GHz Intel Pentium 4 processors, compared to the reference software's 12 frames per second. The optimization techniques demonstrated in this paper can also be applied to other video and image processing applications. Additionally, after presenting detailed application behavior on general-purpose processors, this paper offers a few recommendations on designing efficient and powerful future video/image applications and standards in light of their hardware implications.
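To make "media instructions" concrete, here is a minimal sketch of one operation that motion compensation relies on: the rounded average of two pixel rows, (a + b + 1) >> 1, which H.264 uses for quarter-sample interpolation. It is not code from the paper. The function names `avg_pixels_scalar` and `avg_pixels_simd` are hypothetical, and the choice of the SSE2 intrinsic `_mm_avg_epu8` (available on the Pentium 4) is an assumption about the kind of instruction the authors have in mind.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Scalar reference (hypothetical helper): rounded average of two pixel
 * rows, one pixel per iteration. */
static void avg_pixels_scalar(uint8_t *dst, const uint8_t *a,
                              const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* SIMD variant (hypothetical helper): when SSE2 is available, processes
 * 16 pixels per iteration; any remaining pixels, or all of them on a
 * non-SSE2 build, go through the scalar routine. */
static void avg_pixels_simd(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        /* Unaligned loads, since rows need not be 16-byte aligned. */
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* _mm_avg_epu8 computes (a + b + 1) >> 1 on 16 unsigned bytes. */
        _mm_storeu_si128((__m128i *)(dst + i), _mm_avg_epu8(va, vb));
    }
#endif
    avg_pixels_scalar(dst + i, a + i, b + i, n - i);
}
```

Both routines should produce identical output for any row length. The SIMD path simply replaces 16 scalar add-and-shift steps with a single instruction, which is the general pattern behind the kind of speedup the abstract reports.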