Browsing by subject "Check pointing"
Now showing items 1-1 of 1
Programmer-directed partial redundancy for resilient HPC
(Association for Computing Machinery (ACM), 2015)
Conference proceedings paper
Restricted access (publisher's policy)
In this work we propose partial task replication and check-pointing for task-parallel HPC applications to mitigate silent data corruption (SDC) errors. As the complete replication of all application tasks can be prohibitive ...