dc.contributor.author: Ozen, Guray
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.author: Labarta Mancho, Jesús José
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2016-11-18T16:02:22Z
dc.date.issued: 2016
dc.identifier.citation: Ozen, G., Ayguade, E., Labarta, J. POSTER: collective dynamic parallelism for directive based GPU programming languages and compilers. In: International Conference on Parallel Architectures and Compilation Techniques. "PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation". Haifa: Association for Computing Machinery (ACM), 2016, p. 423-424.
dc.identifier.isbn: 978-1-4503-4121-9
dc.identifier.uri: http://hdl.handle.net/2117/96856
dc.description.abstract: Early programs for GPU (Graphics Processing Unit) acceleration were based on a flat, bulk parallel programming model, in which programs had to perform a sequence of kernel launches from the host CPU. The latest releases of these devices support dynamic (or nested) parallelism, making it possible to launch kernels from threads running on the device, without host intervention. Unfortunately, the overhead of launching kernels from the device is higher than that of launching them from the host CPU, which makes the exploitation of dynamic parallelism unprofitable. This paper proposes and evaluates the basic idea behind a user-directed code transformation technique, named collective dynamic parallelism (CollectiveDP), that targets the effective exploitation of nested parallelism in modern GPUs. The technique dynamically packs dynamic parallelism kernel invocations and postpones their execution until a batch of them is available. We show that for sparse matrix-vector multiplication, CollectiveDP outperforms well-optimized libraries, making GPUs useful when matrices are highly irregular.
dc.format.extent: 2 p.
dc.language.iso: eng
dc.publisher: Association for Computing Machinery (ACM)
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: Application programming interfaces (API)
dc.subject.other: Computer graphics
dc.subject.other: Concurrency control
dc.subject.other: Cosine transforms
dc.subject.other: Memory architecture
dc.subject.other: Parallel architectures
dc.subject.other: Parallel programming
dc.subject.other: Program processors
dc.subject.other: CUDA
dc.subject.other: Graphics Processing Unit
dc.subject.other: Languages and compilers
dc.subject.other: Nested Parallelism
dc.subject.other: Openacc
dc.subject.other: openmp
dc.subject.other: Parallel programming model
dc.subject.other: Sparse matrix-vector multiplication
dc.title: POSTER: collective dynamic parallelism for directive based GPU programming languages and compilers
dc.type: Conference report
dc.subject.lemac: Processament en paral·lel (Ordinadors)
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1145/2967938.2974056
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://dl.acm.org/citation.cfm?doid=2967938.2974056
dc.rights.access: Restricted access - publisher's policy
local.identifier.drac: 19160709
dc.description.version: Postprint (published version)
dc.date.lift: 10000-01-01
local.citation.author: Ozen, G.; Ayguade, E.; Labarta, J.
local.citation.contributor: International Conference on Parallel Architectures and Compilation Techniques
local.citation.pubplace: Haifa
local.citation.publicationName: PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation
local.citation.startingPage: 423
local.citation.endingPage: 424
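
Illustrative note on the abstract: the packing idea it describes (collect nested kernel invocations and run them later as one batch) can be pictured with a small host-side simulation. This is only a sketch under stated assumptions, not the authors' implementation or any compiler's output. The function name `spmv_collective`, the `threshold` parameter, and the `pending` list are hypothetical names introduced here, and plain Python loops stand in for GPU threads and kernel launches.

```python
# Host-side simulation only: a hypothetical illustration of "pack child
# launches, then run them together". Not the authors' implementation.

def spmv_collective(row_ptr, col_idx, values, x, threshold=4):
    """Compute y = A @ x for a matrix A stored in CSR form.

    Rows with at most `threshold` nonzeros are handled inline by the
    "parent" loop. Longer rows do not trigger an individual child launch;
    their requests are packed into `pending` and executed in one deferred
    pass. Returns the result vector and the number of deferred rows.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    pending = []  # packed (row, start, end) requests

    for row in range(n_rows):
        start, end = row_ptr[row], row_ptr[row + 1]
        if end - start <= threshold:
            # Short row: cheap enough to finish in the parent.
            y[row] = sum(values[k] * x[col_idx[k]] for k in range(start, end))
        else:
            # Long row: postpone instead of launching a child right away.
            pending.append((row, start, end))

    # Single "collective" pass over all postponed rows. On a GPU this would
    # correspond to one launch that spreads the packed work over many threads.
    for row, start, end in pending:
        y[row] = sum(values[k] * x[col_idx[k]] for k in range(start, end))

    return y, len(pending)
```

For a 3 x 6 matrix with rows of 1, 6, and 2 nonzeros and the default threshold of 4, only the middle row is deferred; the other two are computed inline. The threshold is an assumption of this sketch; the record does not say how the technique decides what to pack.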