An unified parallel C compiler that implements automatic communication aggregation
Document type: Text in conference proceedings
Access conditions: Restricted access per publisher policy
Partitioned Global Address Space (PGAS) programming languages, such as Unified Parallel C (UPC), offer an attractive high-productivity programming model for large-scale parallel machines. PGAS languages partition the application’s address space into private, shared-local and shared-remote memory. When running in a distributed-memory environment, accessing shared-remote memory leads to implicit communication. For fine-grained accesses, which are frequently found in UPC programs, this communication overhead can significantly impact program performance. One way to reduce the number of fine-grained accesses is to coalesce several accesses into a single access. This paper presents an analysis that identifies opportunities for coalescing and an algorithm that allows the compiler to automatically coalesce accesses to shared-remote memory in UPC. It also describes how the compiler can create opportunities for coalescing through loop unrolling. Results obtained from coalescing accesses in manually unrolled parallel loops are presented to demonstrate the benefit of combining parallel loop unrolling and communication coalescing.
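The idea of coalescing after unrolling can be illustrated with a small sketch in plain C. This is not the paper's compiler algorithm and does not use any UPC runtime API: the array `remote_mem`, the counter `messages_sent`, the helpers `remote_get` and `remote_get_block`, and the unroll factor `UNROLL` are all hypothetical names that merely simulate shared-remote memory and count how many messages each loop shape would issue.

```c
#include <stddef.h>
#include <string.h>

#define N 64       /* elements in the simulated remote array (assumption) */
#define UNROLL 4   /* hypothetical unroll factor; N must be a multiple of it */

double remote_mem[N];   /* stands in for shared-remote memory */
long messages_sent;     /* counts simulated network messages */

/* Fill the simulated remote array with 0, 1, ..., N-1 and reset the counter. */
void init_remote(void) {
    for (size_t i = 0; i < N; i++) {
        remote_mem[i] = (double)i;
    }
    messages_sent = 0;
}

/* Fine-grained access: one simulated message per element. */
double remote_get(size_t i) {
    messages_sent++;
    return remote_mem[i];
}

/* Coalesced access: one simulated message for a contiguous block. */
void remote_get_block(double *dst, size_t start, size_t count) {
    messages_sent++;
    memcpy(dst, &remote_mem[start], count * sizeof(double));
}

/* Original loop shape: every iteration triggers its own remote access. */
double sum_fine_grained(void) {
    double sum = 0.0;
    for (size_t i = 0; i < N; i++) {
        sum += remote_get(i);
    }
    return sum;
}

/* Unrolled by UNROLL: the accesses of one unrolled body are adjacent,
   so they are fetched together with a single block transfer. */
double sum_coalesced(void) {
    double sum = 0.0;
    double buf[UNROLL];
    for (size_t i = 0; i < N; i += UNROLL) {
        remote_get_block(buf, i, UNROLL);
        for (size_t k = 0; k < UNROLL; k++) {
            sum += buf[k];
        }
    }
    return sum;
}
```

Calling `init_remote()` and then `sum_fine_grained()` leaves `messages_sent` at `N`, while `sum_coalesced()` leaves it at `N / UNROLL` for the same result. The sketch assumes contiguous accesses to a single remote owner; a real UPC array may be distributed across threads, which is what the analysis described above has to account for.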
Citation: Barton, C. [et al.]. An unified parallel C compiler that implements automatic communication aggregation. In: Workshop on Compilers for Parallel Computing. "CPC 2009: 14th Workshop on Compilers for Parallel Computing: January 7-9, 2009: IBM Research Center, Zurich, Switzerland". Zurich: 2009.
|p27_barton.pdf||Article CPC09||272.7Kb||Restricted access|