Asynchronous and exact forward recovery for detected errors in iterative solvers
Cite as:
hdl:2117/118042
Document type: Article
Publication date: 2018-03-19
Access conditions: Open access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to existing legal exemptions, its reproduction, distribution,
public communication, or transformation is prohibited without the authorization of the rights holder.
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
ROMOL - Riding on Moore's Law (EC-FP7-321253)
Abstract
Current trends and projections show that faults in computer systems are becoming increasingly common. Such errors may be detected, and possibly corrected transparently, e.g. by Error Correcting Codes (ECC). For a program to be fault-tolerant, it also needs to handle Detected and Uncorrected Errors (DUE), such as an ECC encountering too many bit flips in a codeword. While correcting an error has an overhead in itself, it can also affect the progress of a program. The most generic technique, rolling back the program state to a previously taken checkpoint, discards any progress made since that checkpoint. Alternatively, application-specific techniques exist, such as restarting an iterative program with its latest iteration's values as the initial guess.
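The restart technique mentioned at the end of the abstract can be illustrated with a small sketch. It is not the asynchronous, exact recovery method of the article; it only shows, under stated assumptions, what "restarting with the latest iteration's values as initial guess" means for a conjugate gradient solver. The function names, the `fault_at` parameter, and the simulated fault (entry 0 of the iterate reported lost and reset to zero) are all hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cg_with_restart(A, b, x0, fault_at=None, tol=1e-10, max_iter=200):
    """Conjugate gradient for a symmetric positive definite A.

    At iteration `fault_at` a detected, uncorrected error is simulated
    (an assumption made for illustration), and the solver restarts
    using the surviving iterate as its initial guess.
    """
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
    p = list(r)
    rs = dot(r, r)
    restarts = 0
    for k in range(max_iter):
        if rs ** 0.5 < tol:
            return x, k, restarts
        if k == fault_at:
            # Simulated DUE: entry 0 of x is reported lost, so it is reset.
            x[0] = 0.0
            # Restart: keep the rest of x as the initial guess and rebuild
            # the residual and search direction from scratch.
            r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
            p = list(r)
            rs = dot(r, r)
            restarts += 1
            continue
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, max_iter, restarts


# Small SPD test system (tridiagonal, diagonally dominant).
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]
x, iterations, restarts = cg_with_restart(A, b, [0.0] * 4, fault_at=2)
```

The restart discards the accumulated search directions, so the solver still converges but loses the conjugacy built up before the fault. This is the kind of lost progress the abstract contrasts with checkpoint rollback.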
Description
© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Citation: Jaulmes, L., Casas, M., Moreto, M., Ayguadé, E., Labarta, J., Valero, M. Asynchronous and exact forward recovery for detected errors in iterative solvers. "IEEE transactions on parallel and distributed systems", 19 March 2018, vol. 29, no. 9, p. 1961-1974
ISSN: 1045-9219
Publisher's version: https://ieeexplore.ieee.org/document/8320336/
Files | Description | Size | Format | View |
---|---|---|---|---|
08320336.pdf | | 1,566Mb | | View/Open |