Show simple item record

dc.contributor.author: Shafiq, Muhammad
dc.contributor.author: Pericas, Miquel
dc.contributor.author: Navarro, Nacho
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2014-07-23T08:56:15Z
dc.date.available: 2014-07-23T08:56:15Z
dc.date.created: 2013
dc.date.issued: 2013
dc.identifier.citation: Shafiq, M. [et al.]. Design space explorations for streaming accelerators using streaming architectural simulator. A: International Bhurban Conference on Applied Sciences and Technology. "Proceedings of 2013 10th International Bhurban Conference on Applied Sciences & Technology (IBCAST): 15th-19th January, 2013". Islamabad: Institute of Electrical and Electronics Engineers (IEEE), 2013, p. 169-178.
dc.identifier.isbn: 978-1-4673-4426-5
dc.identifier.uri: http://hdl.handle.net/2117/23588
dc.description.abstract: In recent years, streaming accelerators such as GPUs have emerged as an effective step towards parallel computing. The wish list for these devices spans from support for thousands of small cores to a nature very close to general-purpose computing. This makes the design space vast for future accelerators containing thousands of parallel streaming cores, which complicates choosing the right architectural configuration for next-generation devices. However, accurate design space exploration tools developed for massively parallel architectures can ease this task. The main objectives of this work are twofold. (i) We present a complete environment of a trace-driven simulator named SArcs (Streaming Architectural Simulator) for streaming accelerators. (ii) We use our simulation tool-chain for design space explorations of GPU-like streaming architectures. Our design space explorations for different architectural aspects of a GPU-like device are with reference to a baseline established for NVIDIA's Fermi architecture (GPU Tesla C2050). The explored aspects include the performance effects of variations in the configurations of Streaming Multiprocessors, Global Memory Bandwidth, Channels between SMs down to the Memory Hierarchy, and Cache Hierarchy. The explorations are performed using application kernels from Vector Reduction, 2D-Convolution, Matrix-Matrix Multiplication and 3D-Stencil. Results show that the configurations of the computational resources for the current Fermi GPU device can deliver higher performance with further improvement in the global memory bandwidth for the same device.
dc.format.extent: 10 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Computer science::Computer architecture::Parallel architectures
dc.subject.lcsh: High performance computing
dc.subject.lcsh: Parallel programming (Computer science)
dc.title: Design space explorations for streaming accelerators using streaming architectural simulator
dc.type: Conference report
dc.subject.lemac: Programació en paral·lel (Informàtica)
dc.subject.lemac: Superordinadors
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/IBCAST.2013.6512151
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6512151
dc.rights.access: Open Access
local.identifier.drac: 13639017
dc.description.version: Postprint (author’s final draft)
local.citation.author: Shafiq, M.; Pericas, M.; Navarro, N.; Ayguade, E.
local.citation.contributor: International Bhurban Conference on Applied Sciences and Technology
local.citation.pubplace: Islamabad
local.citation.publicationName: Proceedings of 2013 10th International Bhurban Conference on Applied Sciences & Technology (IBCAST): 15th-19th January, 2013
local.citation.startingPage: 169
local.citation.endingPage: 178

