Storage-heterogeneity aware task-based programming models to optimize I/O intensive applications

Cite as:
hdl:2117/367747
Document type: Article
Defense date: 2022-12-01
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Project: EXPERTISE - models, EXperiments and high PERformance computing for Turbine mechanical Integrity and Structural dynamics in Europe (EC-H2020-721865)
BSC - COMPUTACION DE ALTAS PRESTACIONES VIII (AEI-PID2019-107255GB-C21)
Abstract
Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components called tasks. Nevertheless, in the era of Big Data and Exascale, the amount of data produced by modern scientific applications has already surpassed terabytes and is rapidly increasing. Hence, I/O performance has become the bottleneck that must be overcome to further improve overall performance. New storage technologies offer faster, higher-bandwidth solutions than traditional Parallel File Systems (PFS). Such storage devices are deployed in modern-day infrastructures to boost I/O performance by offering a fast layer that absorbs the generated data. Therefore, any programming model targeting higher performance must manage this heterogeneity and take advantage of it to improve the I/O performance of applications. Towards this goal, we propose in this paper a set of programming model capabilities that we refer to as Storage-Heterogeneity Awareness. Such capabilities include: (i) abstracting the heterogeneity of storage systems, and (ii) optimizing I/O performance by supporting dedicated I/O schedulers and an automatic data flushing technique. The evaluation section of this paper presents the performance results of different applications on the MareNostrum CTE-Power heterogeneous storage cluster. Our experiments demonstrate that a storage-heterogeneity aware programming model can achieve an I/O performance speedup of up to almost 5x and a total time improvement of up to 48% compared to the reference PFS-based usage of the execution infrastructure.
Citation: Elshazly, H.; Ejarque, J.; Badia, R.M. Storage-heterogeneity aware task-based programming models to optimize I/O intensive applications. "IEEE transactions on parallel and distributed systems", 2022, vol. 33, no. 12, p. 3589-3599.
ISSN: 1045-9219
Publisher version: https://ieeexplore.ieee.org/document/9739916
Files | Description | Size | Format | View |
---|---|---|---|---|
Elshazly-Storage-Hetero-TPDS2022.pdf | | 788,5Kb | PDF | View/Open |