dc.contributor.author: Beltran Querol, Vicenç
dc.contributor.author: Carrera Pérez, David
dc.contributor.author: Torres Viñals, Jordi
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2010-10-27T10:25:08Z
dc.date.available: 2010-10-27T10:25:08Z
dc.date.created: 2009-12-16
dc.date.issued: 2009-12-16
dc.identifier.citation: Beltran, V. [et al.]. CellMT: A cooperative multithreading library for the Cell/B.E. In: International Conference on High Performance Computing. "16th International Conference on High Performance Computing". Kochi: IEEE Computer Society Publications, 2009, p. 245-253.
dc.identifier.uri: http://hdl.handle.net/2117/10016
dc.description.abstract: The Cell BE processor has proved that heterogeneous multi-core systems can provide huge computational power with high efficiency for a wide range of applications. The simple design of the computational units and the use of small managed local memories are the key to achieving high efficiency and performance at the same time. However, this simple and efficient hardware design comes at the price of higher code complexity. Code written to run on this kind of processor must deal with several issues such as code vectorization, loop unrolling, or the explicit management of local memories. Some of these issues, such as vectorization or loop unrolling, can be partially solved by the compiler, but the overlapping of data transfer and computation times must be manually addressed by the programmer with techniques such as double buffering, which increase the code complexity. In this paper we present a user-level threading library called CellMT that effectively hides memory latencies. The concurrent execution of several threads inside each SPU naturally overlaps computation and data transfer times without increasing the code complexity. To prove the suitability and feasibility of our multithreaded library, we perform an exhaustive performance evaluation with a synthetic benchmark and a real application. The experimental results show that the multithreaded approach can outperform a hand-coded double buffering scheme, with speedups from 0.96x to 3.2x, while maintaining the complexity of a naive buffering scheme.
dc.format.extent: 9 p.
dc.language.iso: eng
dc.publisher: IEEE Computer Society Publications
dc.subject: UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh: Compilers (Computer programs)
dc.subject.other: Multithreading
dc.subject.other: Multiprocessing systems
dc.subject.other: Program compilers
dc.title: CellMT: A cooperative multithreading library for the Cell/B.E.
dc.type: Conference report
dc.subject.lemac: Compilers (Computer programs)
dc.subject.lemac: Multiprocessors
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/HIPC.2009.5433205
dc.description.peerreviewed: Peer Reviewed
dc.rights.access: Open Access
local.identifier.drac: 2339805
dc.description.version: Postprint (published version)
local.citation.author: Beltran, V.; Carrera, D.; Torres, J.; Ayguade, E.
local.citation.contributor: International Conference on High Performance Computing
local.citation.pubplace: Kochi
local.citation.publicationName: 16th International Conference on High Performance Computing
local.citation.startingPage: 245
local.citation.endingPage: 253
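
The abstract's central claim, that several cooperative threads on one core overlap data transfer with computation without hand-written double buffering, can be illustrated with a small simulation. The sketch below is not the CellMT API and does not run on a Cell/B.E.; it models time in abstract ticks, and every name in it (`Clock`, `worker`, `run`, `dma_latency`, `compute_cost`) is a hypothetical choice made for illustration. Each worker keeps the naive "fetch, wait, compute" structure; the only change between runs is how many workers share the core.

```python
import heapq


class Clock:
    """Simulated core time in abstract ticks (an illustrative assumption)."""

    def __init__(self):
        self.now = 0


def worker(blocks, clock, dma_latency, compute_cost):
    """One cooperative thread using a naive fetch-wait-compute loop."""
    for _ in range(blocks):
        done_at = clock.now + dma_latency  # issue an asynchronous transfer
        yield done_at                      # give up the core until data arrives
        clock.now += compute_cost          # resumed: compute on the fetched block


def run(num_threads, total_blocks, dma_latency=4, compute_cost=4):
    """Return the simulated ticks needed to process total_blocks blocks."""
    assert total_blocks % num_threads == 0
    per_thread = total_blocks // num_threads
    clock = Clock()
    # Heap entries are (time the thread becomes ready, thread id, generator).
    ready = [(0, tid, worker(per_thread, clock, dma_latency, compute_cost))
             for tid in range(num_threads)]
    heapq.heapify(ready)
    while ready:
        ready_time, tid, gen = heapq.heappop(ready)
        # If no thread has its data yet, the core idles until the transfer ends.
        clock.now = max(clock.now, ready_time)
        try:
            done_at = next(gen)  # run the thread until its next transfer wait
        except StopIteration:
            continue             # thread finished all its blocks
        heapq.heappush(ready, (done_at, tid, gen))
    return clock.now


if __name__ == "__main__":
    print("1 thread :", run(1, 8), "ticks")
    print("2 threads:", run(2, 8), "ticks")
```

With the default parameters, one thread spends half of its time waiting for transfers, while two threads keep the core busy almost continuously, because one thread computes while the other's transfer is in flight. The figures produced by this toy model say nothing about the speedups reported in the paper.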

