dc.contributor.author: Subasi, Omer
dc.contributor.author: Unsal, Osman Sabri
dc.contributor.author: Labarta Mancho, Jesús José
dc.contributor.author: Yalcin, Gulay
dc.contributor.author: Cristal Kestelman, Adrián
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2016-11-02T13:22:39Z
dc.date.issued: 2016
dc.identifier.citation: Subasi, O., Unsal, O., Labarta, J., Yalcin, G., Cristal, A. CRC-based memory reliability for task-parallel HPC applications. In: IEEE International Symposium on Parallel and Distributed Processing. "2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS 2016): Chicago, Illinois, USA: 23-27 May 2016". Chicago, Illinois: Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 1101-1112.
dc.identifier.isbn: 9781509021413
dc.identifier.uri: http://hdl.handle.net/2117/91341
dc.description.abstract: Memory reliability will be one of the major concerns for future HPC and Exascale systems. This concern is mostly attributed to the expected massive increase in memory capacity and the number of memory devices in Exascale systems. For memory systems, Error Correcting Codes (ECC) are the most commonly used mechanism. However, state-of-the-art hardware ECCs will not be sufficient in terms of error coverage for future computing systems, and stronger hardware ECCs providing more coverage have prohibitive costs in terms of area, power and latency. Software-based solutions are needed to cooperate with hardware. In this work, we propose a Cyclic Redundancy Check (CRC)-based software mechanism for task-parallel HPC applications. Our mechanism incurs only 1.7% performance overhead with hardware acceleration while being highly scalable at large scale. Our mathematical analysis demonstrates the effectiveness of our scheme and its error coverage. Results show that our CRC-based mechanism reduces the memory vulnerability by 87% on average with up to 32-bit burst (consecutive) and 5-bit arbitrary error correction capability.
dc.description.sponsorship: This work was supported by FI-DGR 2013 scholarship and the European Community’s Seventh Framework Programme [FP7/2007-2013] under the Mont-blanc 2 Project (www.montblanc-project.eu), grant agreement no. 610402 and TIN2015-65316-P.
dc.format.extent: 12 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Computer science::Computer architecture::Parallel architectures
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: Application programs
dc.subject.other: Data flow analysis
dc.subject.other: Error correction
dc.subject.other: Errors
dc.subject.other: Hardware
dc.subject.other: Reconfigurable hardware
dc.subject.other: Reliability
dc.subject.other: Cyclic redundancy check
dc.subject.other: Dataflow model
dc.subject.other: Error correction capability
dc.subject.other: Hardware acceleration
dc.subject.other: Mathematical analysis
dc.subject.other: Memory reliability
dc.subject.other: Software-based solutions
dc.subject.other: Task parallelism
dc.title: CRC-based memory reliability for task-parallel HPC applications
dc.type: Conference report
dc.subject.lemac: Parallel processing (Computers)
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/IPDPS.2016.70
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: ieeexplore.ieee.org/document/7516107/
dc.rights.access: Restricted access - publisher's policy
drac.iddocument: 18940030
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO/1PE/TIN2015-65316-P
dc.date.lift: 10000-01-01
upcommons.citation.author: Subasi, O., Unsal, O., Labarta, J., Yalcin, G., Cristal, A.
upcommons.citation.contributor: IEEE International Symposium on Parallel and Distributed Processing
upcommons.citation.pubplace: Chicago, Illinois
upcommons.citation.published: true
upcommons.citation.publicationName: 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS 2016): Chicago, Illinois, USA: 23-27 May 2016
upcommons.citation.startingPage: 1101
upcommons.citation.endingPage: 1112
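
The abstract in this record describes a software CRC check over application memory. As a rough illustration only, and not the authors' implementation, the sketch below shows the detection half of such a scheme: a checksum is stored for a buffer, and a later recomputation exposes a 32-bit burst of flipped bits. It assumes Python's standard `zlib.crc32` (a 32-bit CRC); the helper names `checksum`, `flip_burst`, and `is_intact`, and the `task_input` buffer, are hypothetical. Error correction, hardware acceleration, and the task-parallel runtime integration mentioned in the abstract are not shown.

```python
import zlib


def checksum(buf: bytes) -> int:
    """Return the CRC-32 of a buffer (hypothetical helper wrapping zlib.crc32)."""
    return zlib.crc32(buf)


def flip_burst(buf: bytes, start_bit: int, length: int) -> bytes:
    """Return a copy of buf with `length` consecutive bits flipped from start_bit.

    Bits are indexed least-significant-bit first within each byte, which
    matches the bit order in which zlib's reflected CRC-32 consumes input.
    """
    out = bytearray(buf)
    for bit in range(start_bit, start_bit + length):
        out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def is_intact(buf: bytes, expected_crc: int) -> bool:
    """Recompute the CRC and compare it with the stored value."""
    return zlib.crc32(buf) == expected_crc


# Hypothetical data standing in for a task's input region.
task_input = bytes(range(256)) * 4

# Store the checksum while the data is known to be good.
stored_crc = checksum(task_input)

# Simulate a memory fault: a burst of 32 consecutive flipped bits.
corrupted = flip_burst(task_input, start_bit=100, length=32)

print("original intact: ", is_intact(task_input, stored_crc))
print("corrupted intact:", is_intact(corrupted, stored_crc))
```

A CRC with a degree-32 generator polynomial detects every burst error of length 32 or less, which is why the corrupted buffer is guaranteed to fail the check here rather than merely being likely to.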


All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication or transformation is prohibited without the authorization of the rights holder.