Performance impact of unaligned memory operations in SIMD extensions for video CODEC applications
Document type: Conference report
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
Although SIMD extensions are a cost-effective way to exploit the data-level parallelism present in most media applications, we show that they have a very limited memory architecture with weak support for unaligned memory accesses. In video codecs and other applications, accessing unaligned positions without efficient architectural support incurs a large performance penalty and in some cases makes vectorization counter-productive. In this paper we analyze the performance impact of extending the Altivec SIMD ISA with unaligned memory operations. Results show that for several kernels in the H.264/AVC media codec, unaligned access support provides a speedup of up to 3.8 times compared to the plain SIMD version, translating into an average of 1.2 times for the entire application. In addition to providing a significant performance advantage, unaligned memory instructions make programming SIMD code much easier for both the manual developer and the auto-vectorizing compiler.
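The overhead described in the abstract can be illustrated with a small model. The following is a minimal sketch in portable C, not AltiVec intrinsics and not the code evaluated in the paper: it assumes 16-byte vectors and an aligned load that ignores the low four address bits (as the AltiVec `lvx` instruction does), and it emulates an unaligned load with two aligned loads plus a byte selection, the role that `lvsl` and `vperm` usually play. All names (`vec16`, `load_aligned`, `load_unaligned_emulated`, `load_unaligned_direct`, `verify_all_offsets`) are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16 /* assumption: 128-bit vector registers */

/* Hypothetical stand-in for a SIMD register. */
typedef struct { uint8_t b[VEC_BYTES]; } vec16;

/* Aligned load: the low four address bits are dropped, so the
 * load always starts at a 16-byte boundary at or below p. */
static vec16 load_aligned(const uint8_t *p) {
    vec16 v;
    uintptr_t base = (uintptr_t)p & ~(uintptr_t)(VEC_BYTES - 1);
    memcpy(v.b, (const uint8_t *)base, VEC_BYTES);
    return v;
}

/* Software realignment: two aligned loads and a byte selection.
 * Loading the second block from p + 15 means that an already
 * aligned p re-reads the same block instead of the next one. */
static vec16 load_unaligned_emulated(const uint8_t *p) {
    unsigned shift = (unsigned)((uintptr_t)p & (VEC_BYTES - 1));
    vec16 lo = load_aligned(p);
    vec16 hi = load_aligned(p + VEC_BYTES - 1);
    vec16 out;
    for (unsigned i = 0; i < VEC_BYTES; i++) {
        unsigned j = shift + i;
        out.b[i] = (j < VEC_BYTES) ? lo.b[j] : hi.b[j - VEC_BYTES];
    }
    return out;
}

/* Model of a load with unaligned support: one access, no merge step. */
static vec16 load_unaligned_direct(const uint8_t *p) {
    vec16 v;
    memcpy(v.b, p, VEC_BYTES);
    return v;
}

/* Checks that both paths agree for every offset in a small buffer
 * and returns the number of offsets checked. */
static int verify_all_offsets(void) {
    _Alignas(VEC_BYTES) uint8_t buf[64];
    int checked = 0;
    for (int i = 0; i < 64; i++) buf[i] = (uint8_t)i;
    for (int off = 0; off < 32; off++) {
        vec16 a = load_unaligned_emulated(buf + off);
        vec16 d = load_unaligned_direct(buf + off);
        assert(memcmp(a.b, d.b, VEC_BYTES) == 0);
        checked++;
    }
    return checked;
}
```

In this model the emulated path does roughly three operations (two loads and a merge) where the direct path does one, which is the kind of per-access overhead that matters in kernels reading pixel blocks at arbitrary offsets.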
Citation: Álvarez, M., Salamí, E., Ramírez, A., Valero, M. Performance impact of unaligned memory operations in SIMD extensions for video CODEC applications. In: IEEE International Symposium on Performance Analysis of Systems and Software. "ISPASS 2007: IEEE International Symposium on Performance Analysis of Systems And Software: April 25-27, 2007, San Jose, CA, USA". San Jose, CA: Institute of Electrical and Electronics Engineers (IEEE), 2007, p. 62-71.