An unified parallel C compiler that implements automatic communication aggregation
Document type: Conference report
Rights access: Restricted access - publisher's policy
Partitioned Global Address Space (PGAS) programming languages, such as Unified Parallel C (UPC), offer an attractive high-productivity model for programming large-scale parallel machines. PGAS languages partition the application’s address space into private, shared-local, and shared-remote memory. When running in a distributed-memory environment, accessing shared-remote memory leads to implicit communication. For fine-grained accesses, which are frequently found in UPC programs, this communication overhead can significantly impact program performance. One way to reduce the number of fine-grained accesses is to coalesce several accesses into a single access. This paper presents an analysis that identifies opportunities for coalescing and an algorithm that allows the compiler to automatically coalesce accesses to shared-remote memory in UPC. It also describes how the compiler can create opportunities for coalescing through loop unrolling. Results obtained from coalescing accesses in manually unrolled parallel loops are presented to demonstrate the benefit of combining parallel loop unrolling and communication coalescing.
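The idea of combining loop unrolling with coalescing can be illustrated with a small sketch in plain C. This is not the paper's algorithm or the output of any UPC compiler: the "remote" memory is an ordinary local array, communication is modelled by a message counter, and every name here (`remote_get`, `remote_get_block`, `UNROLL`, `messages`) is hypothetical. The sketch also assumes the accessed elements are contiguous on one remote owner and that the trip count is a multiple of the unroll factor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 16       /* elements in the simulated shared-remote array */
#define UNROLL 4   /* hypothetical unroll factor; assumed to divide N */

static double remote_mem[N];  /* stands in for shared-remote memory */
static long messages = 0;     /* number of simulated network messages */

/* Initialise the simulated remote array with 0, 1, ..., N-1. */
static void fill_remote(void) {
    for (size_t i = 0; i < N; i++) {
        remote_mem[i] = (double)i;
    }
}

/* Fine-grained access: one simulated message per element. */
static double remote_get(size_t i) {
    messages++;
    return remote_mem[i];
}

/* Coalesced access: one simulated message fetches `count` contiguous elements. */
static void remote_get_block(double *dst, size_t start, size_t count) {
    messages++;
    memcpy(dst, &remote_mem[start], count * sizeof(double));
}

/* Original loop: every iteration issues its own remote access. */
static double sum_fine_grained(void) {
    double sum = 0.0;
    for (size_t i = 0; i < N; i++) {
        sum += remote_get(i);
    }
    return sum;
}

/* Loop unrolled by UNROLL: the accesses of one unrolled body are
   replaced by a single block transfer into a private buffer. */
static double sum_coalesced(void) {
    double sum = 0.0;
    double buf[UNROLL];
    assert(N % UNROLL == 0);  /* no remainder loop in this sketch */
    for (size_t i = 0; i < N; i += UNROLL) {
        remote_get_block(buf, i, UNROLL);
        for (size_t j = 0; j < UNROLL; j++) {
            sum += buf[j];
        }
    }
    return sum;
}
```

Calling `fill_remote()` and then each summation function with `messages` reset to zero beforehand gives the same sum from both versions, while the coalesced version issues N / UNROLL simulated messages instead of N. A real compiler would additionally have to establish which thread owns each accessed element and handle leftover iterations, which this sketch leaves out.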
Citation: Barton, C. [et al.]. An unified parallel C compiler that implements automatic communication aggregation. In: Workshop on Compilers for Parallel Computing. "CPC 2009: 14th Workshop on Compilers for Parallel Computing: January 7-9, 2009: IBM Research Center, Zurich, Switzerland". Zurich: 2009.
Files: p27_barton.pdf (Article CPC09, 272.7Kb, restricted access)