dc.contributor.author: O'Brien, Kathryn
dc.contributor.author: O'Brien, Kevin
dc.contributor.author: González Tallada, Marc
dc.contributor.author: Vujic, Nikola
dc.contributor.author: Martorell Bofill, Xavier
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.author: Eichenberger, Alexandre E.
dc.contributor.author: Chen, Tong
dc.contributor.author: Sura, Zehra
dc.contributor.author: Zhang, Tao
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2012-04-10T11:49:13Z
dc.date.available: 2012-04-10T11:49:13Z
dc.date.created: 2008
dc.date.issued: 2008
dc.identifier.citation: González, M. [et al.]. Hybrid access-specific software cache techniques for the cell BE architecture. In: International Conference on Parallel Architectures and Compilation Techniques. "PACT'08. Proceedings of the Seventeenth International Conference on Parallel Architectures and Compilation Techniques". Toronto: Association for Computing Machinery, 2008, p. 292-302.
dc.identifier.isbn: 978-1-60558-282-5
dc.identifier.uri: http://hdl.handle.net/2117/15715
dc.description.abstract: Ease of programming is one of the main impediments to the broad acceptance of multi-core systems with no hardware support for transparent data transfer between local and global memories. Software cache is a robust approach to provide the user with a transparent view of the memory architecture, but this software approach can suffer from poor performance. In this paper, we propose a hierarchical, hybrid software-cache architecture that classifies memory accesses at compile time into two classes, high-locality and irregular. Our approach then steers the memory references toward one of two specific cache structures optimized for their respective access pattern. The specific cache structures are optimized to enable high-level compiler optimizations to aggressively unroll loops, reorder cache references, and/or transform surrounding loops so as to practically eliminate the software cache overhead in the innermost loop. Performance evaluation indicates that improvements due to the optimized software-cache structures combined with the proposed code optimizations translate into speedup factors of 3.5 to 8.4 compared to a traditional software cache approach. As a result, we demonstrate that the Cell BE processor can be a competitive alternative to a modern server-class multi-core such as the IBM Power5 processor for a set of parallel NAS applications.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Association for Computing Machinery
dc.subject: UPC subject areas::Computer science::Software engineering
dc.subject.lcsh: Cache memory
dc.subject.lcsh: Compilers (Computer programs)
dc.subject.other: OpenMP
dc.subject.other: Compiler optimizations
dc.subject.other: Local memories
dc.subject.other: Memory classification
dc.subject.other: Software cache
dc.title: Hybrid access-specific software cache techniques for the cell BE architecture
dc.type: Conference lecture
dc.subject.lemac: Memòria cau (Cache memory)
dc.subject.lemac: Compiladors (Programes d'ordinador) (Compilers (Computer programs))
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1145/1454115.1454156
dc.description.peerreviewed: Peer Reviewed
dc.rights.access: Restricted access - publisher's policy
local.identifier.drac: 2383697
dc.description.version: Postprint (published version)
local.citation.author: González, M.; Vujic, N.; Martorell, X.; Ayguadé, E.; Eichenberger, A.; Chen, T.; Sura, Z.; Zhang, T.; O'Brien, K.; O'Brien, K.
local.citation.contributor: International Conference on Parallel Architectures and Compilation Techniques
local.citation.pubplace: Toronto
local.citation.publicationName: PACT'08. Proceedings of the Seventeenth International Conference on Parallel Architectures and Compilation Techniques
local.citation.startingPage: 292
local.citation.endingPage: 302

