Runtime vs. manual data distribution for architecture-agnostic shared-memory programming models

Nikolopoulos, Dimitrios; Ayguadé Parra, Eduard; Polychronopoulos, C D

doi:10.1023/A:1019899812171

dc.contributor.author	Nikolopoulos, Dimitrios
dc.contributor.author	Ayguadé Parra, Eduard
dc.contributor.author	Polychronopoulos, C D
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned	2018-05-16T14:36:17Z
dc.date.issued	2002-08
dc.identifier.citation	Nikolopoulos, D., Ayguade, E., Polychronopoulos, C. Runtime vs. manual data distribution for architecture-agnostic shared-memory programming models. "Journal of parallel and distributed computing", August 2002, vol. 30, no. 4, p. 225-254.
dc.identifier.issn	0743-7315
dc.identifier.uri	http://hdl.handle.net/2117/117287
dc.description.abstract	This paper compares data distribution methodologies for scaling the performance of OpenMP on NUMA architectures. We investigate the performance of automatic page placement algorithms implemented in the operating system, runtime algorithms based on dynamic page migration, runtime algorithms based on loop scheduling transformations and manual data distribution. These techniques present the programmer with trade-offs between performance and programming effort. Automatic page placement algorithms are transparent to the programmer, but may compromise memory access locality. Dynamic page migration algorithms are also transparent, but require careful engineering and tuned implementations to be effective. Manual data distribution requires substantial programming effort and architecture-specific extensions to the API, but may localize memory accesses in a nearly optimal manner. Loop scheduling transformations may or may not require intervention from the programmer, but conform better to an architecture-agnostic programming paradigm like OpenMP. We identify the conditions under which runtime data distribution algorithms can optimize memory access locality in OpenMP. We also present two novel runtime data distribution techniques, one based on memory access traces and another based on affinity scheduling of parallel loops. These techniques can be used to effectively replace manual data distribution in regular applications. The results provide a proof of concept that it is possible to scale a portable shared-memory programming model up to more than 100 processors, without modifying the API and without exposing architectural details to the programmer.
dc.format.extent	30 p.
dc.language.iso	eng
dc.subject	UPC subject areas::Computer science::Computer architecture::Parallel architectures
dc.subject.lcsh	Parallel processing (Electronic computers)
dc.subject.other	Data distribution
dc.subject.other	Operating systems
dc.subject.other	Runtime systems
dc.subject.other	Performance evaluation
dc.subject.other	OpenMP
dc.title	Runtime vs. manual data distribution for architecture-agnostic shared-memory programming models
dc.type	Article
dc.subject.lemac	Parallel processing (Computers)
dc.contributor.group	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi	10.1023/A:1019899812171
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	http://link.springer.com/article/10.1023%2FA%3A1019899812171
dc.rights.access	Restricted access - publisher's policy
local.identifier.drac	1642820
dc.description.version	Postprint (published version)
dc.date.lift	10000-01-01
local.citation.author	Nikolopoulos, D.; Ayguade, E.; Polychronopoulos, C.
local.citation.publicationName	Journal of parallel and distributed computing
local.citation.volume	30
local.citation.number	4
local.citation.startingPage	225
local.citation.endingPage	254

Files in this item

Name: Runtime vs. Manual Data Distri ...
Size: 507.1 KB
Format: PDF
Description: Runtime vs. Manual Data Distri ...
