In the last years, there has been much effort in commercial compilers to generate efficient SIMD instructions-based code sequences from conventional sequential programs. However, the small numbers of compilers that can automatically use these
instructions achieve in most cases unsatisfactory results. Therefore, the code often has to be written manually in assembly language or using compiler built-in functions to achieve high performance. In this work, we present source-to-source transformations that help commercial vectorizing compilers to generate efficient SIMD code. Experimental results show that excellent performance can be achieved. In particular, for the problem of matrix product (SGEMM) we almost achieve as high performance as hand-optimized numerical libraries. Our source-tosource transformations are based on the scalar replacement and unroll and jam transformations presented by Callahan et all. In particular, we extend the use of scalar replacement to vectorial replacement and combine this transformation with unroll and jam and outer loop vectorization to fully exploit the vector register level and thus to help the compiler to generate efficient SIMD code. We will show experimentally the effectiveness of our proposal.
CitationBerna, A.; Jimenez, M.; Llaberia, J. Source-to-Source transformations for efficient SIMD code generation. A: Jornadas de Paralelismo. "Actas de las XXII Jornadas de Paralelismo". La Laguna, Tenerife: 2011, p. 719-726.
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder. If you wish to make any use of the work not provided for in the law, please contact: firstname.lastname@example.org