dc.contributor.author: Radojkovic, Petar
dc.contributor.author: Marazakis, Manolis
dc.contributor.author: Carpenter, Paul Matthew
dc.contributor.author: Jeyapaul, Reiley
dc.contributor.author: Gizopoulos, Dimitris
dc.contributor.author: Schulz, Martin
dc.contributor.author: Armejach Sanosa, Adrià
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.author: Canal Corretger, Ramon
dc.contributor.author: Moreto Planas, Miquel
dc.contributor.author: Salami, Behzad
dc.contributor.author: Unsal, Osman Sabri
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.identifier.citation: Radojkovic, P. [et al.]. Towards resilient EU HPC systems: A blueprint. 2020.
dc.description.abstract: This document aims to spearhead a Europe-wide discussion on HPC system resilience and to help the European HPC community define best practices for resilience. We analyse a wide range of state-of-the-art resilience mechanisms and recommend the most effective approaches to employ in large-scale HPC systems. Our guidelines will be useful in the allocation of available resources, as well as guiding researchers and research funding towards the enhancement of resilience approaches with the highest priority and utility. Although our work is focused on the needs of next generation HPC systems in Europe, the principles and evaluations are applicable globally.
dc.description.sponsorship: This work has received funding from the European Union’s Horizon 2020 research and innovation programme under the projects ECOSCALE (grant agreement No 671632), EPI (grant agreement No 826647), EuroEXA (grant agreement No 754337), Eurolab4HPC (grant agreement No 800962), EVOLVE (grant agreement No 825061), EXA2PRO (grant agreement No 801015), ExaNest (grant agreement No 671553), ExaNoDe (grant agreement No 671578), EXDCI-2 (grant agreement No 800957), LEGaTO (grant agreement No 780681), MB2020 (grant agreement No 779877), RECIPE (grant agreement No 801137) and SDK4ED (grant agreement No 780572). The work was also supported by the European Commission’s Seventh Framework Programme under the projects CLERECO (grant agreement No 611404), the NCSA-Inria-ANL-BSC-JSC-Riken-UTK Joint-Laboratory for Extreme Scale Computing – JLESC (, OMPI-X project (No ECP- and the Spanish Government through Severo Ochoa programme (SEV-2015-0493). This work was sponsored in part by the U.S. Department of Energy's Office of Advanced Scientific Computing Research, program managers Robinson Pino and Lucy Nowell. This manuscript has been authored by UT-Battelle, LLC under Contract No DE-AC05-00OR22725 with the U.S. Department of Energy.
dc.format.extent: 30 p.
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh: High performance computing -- Europe
dc.title: Towards resilient EU HPC systems: A blueprint
dc.type: External research report
dc.subject.lemac: Càlcul intensiu (Informàtica) -- Europa
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.contributor.group: Universitat Politècnica de Catalunya. VIRTUOS - Virtualisation and Operating Systems
dc.rights.access: Open Access
dc.relation.projectid: info:eu-repo/grantAgreement/EC/FP7/611404/EU/Cross-Layer Early Reliability Evaluation for the Computing cOntinuum/CLERECO
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/801137/EU/REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems/RECIPE
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/800962/EU/Consolidation of European Research Excellence in Exascale HPC Systems/EUROLAB4HPC2
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/826647/EU/SGA1 (Specific Grant Agreement 1) OF THE EUROPEAN PROCESSOR INITIATIVE (EPI)/EPI SGA1
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/671578/EU/European Exascale Processor Memory Node Design/ExaNoDe
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/780681/EU/Low Energy Toolset for Heterogeneous Computing/LEGaTO
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/779877/EU/Mont-Blanc 2020, European scalable, modular and power efficient HPC processor/Mont-Blanc 2020
local.citation.author: Radojkovic, P.; Marazakis, M.; Carpenter, P.; Jeyapaul, R.; Gizopoulos, D.; Schulz, M.; Armejach, A.; Ayguadé, E.; Canal, R.; Moreto, M.; Salami, B.; Unsal, O.


All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.