FaulTM: Error detection and recovery using hardware transactional memory

dc.contributor.author: Yalcin, Gulay
dc.contributor.author: Unsal, Osman Sabri
dc.contributor.author: Cristal Kestelman, Adrián
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2014-06-19T08:54:18Z
dc.date.created: 2013
dc.date.issued: 2013
dc.description.abstract: Reliability is an essential concern for processor designers due to increasing transient and permanent fault rates. Executing instruction streams redundantly in chip multiprocessors (CMPs) provides high reliability, since it can detect both transient and permanent faults. It also minimizes the Silent Data Corruption rate. However, comparing the results of the instruction streams, checkpointing the entire system, and recovering from detected errors might lead to substantial performance degradation. In this study, we propose FaulTM, an error detection and recovery scheme that uses Hardware Transactional Memory (HTM) to reduce this performance degradation. We show how a minimally modified HTM that features lazy conflict detection and lazy data versioning can provide low-cost reliability in addition to HTM's intended purpose of supporting optimistic concurrency. Compared with lockstepping, FaulTM reduces the performance degradation by 2.5x for the SPEC2006 benchmarks.
dc.description.version: Postprint (published version)
dc.format.extent: 6 p.
dc.identifier.citation: Yalcin, G.; Unsal, O.; Cristal, A. FaulTM: Error detection and recovery using hardware transactional memory. In: Design, Automation and Test in Europe. "Design, Automation and Test in Europe: Grenoble, France, March 18 - 22, 2013". Grenoble: 2013, p. 220-225.
dc.identifier.doi: 10.7873/DATE.2013.058
dc.identifier.isbn: 978-398153700-0
dc.identifier.uri: https://hdl.handle.net/2117/23269
dc.language.iso: eng
dc.rights.access: Restricted access - publisher's policy
dc.subject: UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh: System design
dc.subject.lemac: System design
dc.title: FaulTM: Error detection and recovery using hardware transactional memory
dc.type: Conference report
dspace.entity.type: Publication
local.citation.author: Yalcin, G.; Unsal, O.; Cristal, A.
local.citation.contributor: Design, Automation and Test in Europe
local.citation.endingPage: 225
local.citation.publicationName: Design, Automation and Test in Europe: Grenoble, France, March 18 - 22, 2013
local.citation.pubplace: Grenoble
local.citation.startingPage: 220
local.identifier.drac: 12870498

Files

Original bundle

Name: 06513504.pdf
Size: 905.83 KB
Format: Adobe Portable Document Format