Understanding memory access patterns using the BSC performance tools

Servat, Harald; Labarta Mancho, Jesús José; Hoppe, Hans-Christian; Giménez, Judit; Peña, Antonio J.

doi:10.1016/j.parco.2018.06.007

dc.contributor.author	Servat, Harald
dc.contributor.author	Labarta Mancho, Jesús José
dc.contributor.author	Hoppe, Hans-Christian
dc.contributor.author	Giménez, Judit
dc.contributor.author	Peña, Antonio J.
dc.contributor.other	Barcelona Supercomputing Center
dc.date.accessioned	2018-07-24T10:16:13Z
dc.date.available	2020-07-09T00:26:18Z
dc.date.issued	2018-10
dc.identifier.citation	Servat, H. [et al.]. Understanding memory access patterns using the BSC performance tools. "Parallel Computing", October 2018, vol. 78, p. 1-14.
dc.identifier.issn	0167-8191
dc.identifier.other	https://arxiv.org/abs/2005.05872
dc.identifier.uri	http://hdl.handle.net/2117/119839
dc.description.abstract	The growing gap between processor and memory speeds has led to complex memory hierarchies as processors evolve to mitigate such divergence by exploiting the locality of reference. In this direction, the BSC performance analysis tools have been recently extended to provide insight into the application memory accesses by depicting their temporal and spatial characteristics and correlating them with the source code and the achieved performance simultaneously. These extensions rely on the Precise Event-Based Sampling (PEBS) mechanism available in recent Intel processors to capture information regarding the application memory accesses. The sampled information is later combined with the Folding technique to represent a detailed temporal evolution of the memory accesses in conjunction with the achieved performance and the corresponding source code. The reports generated by the latter tool help not only application developers but also processor architects to better understand how the application behaves and how the system performs. In this paper, we describe a tighter integration of the sampling mechanism into the monitoring package. We also demonstrate the value of the complete workflow by exploring already optimized state-of-the-art benchmarks, providing detailed insight into their memory access behavior. We have taken advantage of this insight to apply small modifications that improve the applications’ performance.
dc.description.sponsorship	This work has been performed in the Intel-BSC Exascale Lab. We would like to thank Forschungszentrum Jülich for the compute time on the Jureca system. This project has received funding from the European Union’s Horizon 2020 research and innovation program under Marie Sklodowska-Curie grant agreement no. 749516.
dc.format.extent	14 p.
dc.language.iso	eng
dc.publisher	Elsevier
dc.subject	UPC subject areas::Computer science
dc.subject.lcsh	High performance computing
dc.subject.other	Performance analysis
dc.subject.other	Memory references
dc.subject.other	Sampling
dc.subject.other	Instrumentation
dc.title	Understanding memory access patterns using the BSC performance tools
dc.type	Article
dc.subject.lemac	Supercomputers
dc.identifier.doi	10.1016/j.parco.2018.06.007
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	https://www.sciencedirect.com/science/article/pii/S0167819118301911
dc.rights.access	Open Access
local.identifier.drac	23358637
dc.description.version	Postprint (author's final draft)
dc.relation.projectid	info:eu-repo/grantAgreement/EC/H2020/749516/EU/Advanced Ecosystem for Broad Heterogeneous Memory Usage/ECO-H-MEM
local.citation.publicationName	Parallel Computing
local.citation.volume	78
local.citation.startingPage	1
local.citation.endingPage	14

Files in this item

Name: memory_access_patterns.pdf
Size: 1.630 MB
Format: PDF

This item appears in the following collections

Journal articles [318]

UPCommons. UPC's open knowledge portal
