A transparent runtime data distribution engine for OpenMP
Rights access: Restricted access - publisher's policy
This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on these systems. We show that, due to the low remote-to-local memory access latency ratio of contemporary NUMA architectures, reasonably balanced page placement schemes, such as round-robin or random distribution, incur only modest performance losses. Second, the paper presents a transparent, user-level page migration engine that can recover any performance lost to suboptimal placement of pages in iterative OpenMP programs. The main body of the paper describes how our OpenMP runtime environment uses page migration to implement implicit data distribution and redistribution schemes without programmer intervention. Our experimental results verify the effectiveness of the proposed framework and provide a proof of concept that it is not necessary to introduce data distribution directives in OpenMP, which preserves the simplicity and portability of the programming model.
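As a rough illustration of the idea in the abstract, the following toy simulation places pages round-robin across NUMA nodes, replays one iteration of a block-partitioned access pattern, and then moves each page to the node that referenced it most. It is a minimal sketch under stated assumptions, not the paper's engine: the function names, the trace format, and the `threshold` criterion are all hypothetical, and a real runtime would read hardware or OS reference counters and call an OS page migration service instead.

```python
from collections import Counter


def round_robin_placement(num_pages, num_nodes):
    """Assign page p to node p % num_nodes (balanced, but locality-oblivious)."""
    return [p % num_nodes for p in range(num_pages)]


def block_access_trace(num_pages, num_nodes, accesses_per_page=10):
    """One iteration of a hypothetical loop: each node touches a contiguous block of pages."""
    block = max(1, num_pages // num_nodes)
    trace = []
    for page in range(num_pages):
        node = min(page // block, num_nodes - 1)
        trace.extend([(node, page)] * accesses_per_page)
    return trace


def remote_fraction(trace, placement):
    """Fraction of accesses whose page lives on a node other than the accessing node."""
    remote = sum(1 for node, page in trace if placement[page] != node)
    return remote / len(trace)


def migrate(trace, placement, threshold=2.0):
    """Return a new placement after one migration pass.

    Assumed criterion (not taken from the paper): move a page to its most
    frequent accessor when that node's reference count exceeds `threshold`
    times the count of the page's current home node.
    """
    counts = [Counter() for _ in placement]
    for node, page in trace:
        counts[page][node] += 1
    new_placement = list(placement)
    for page, per_node in enumerate(counts):
        if not per_node:
            continue  # page was never referenced; leave it where it is
        best_node, best_count = per_node.most_common(1)[0]
        home_count = per_node[placement[page]]
        if best_node != placement[page] and best_count > threshold * home_count:
            new_placement[page] = best_node
    return new_placement


placement = round_robin_placement(16, 4)
trace = block_access_trace(16, 4)
before = remote_fraction(trace, placement)
after = remote_fraction(trace, migrate(trace, placement))
print(before, after)  # 0.75 0.0
```

In this contrived trace every page has a single accessor, so one migration pass removes all remote accesses. Real iterative programs share pages between threads, which is why some threshold on the ratio of remote to local references is needed to avoid moving pages back and forth.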
Citation: Nikolopoulos, D., Papatheodorou, T., Polychronopoulos, C., Labarta, J., Ayguade, E. A transparent runtime data distribution engine for OpenMP. "Scientific programming", July 2001, vol. 8, no. 3, p. 143-162.
| SCIENTIFIC_PROGRAMMING_2000.pdf | A Transparent Runtime Data Distribution Engine for OpenMP | 156.8Kb | Restricted access |