Show simple item record

dc.contributor.author: Hayes, Timothy
dc.contributor.author: Palomar, Oscar
dc.contributor.author: Unsal, Osman Sabri
dc.contributor.author: Cristal Kestelman, Adrián
dc.contributor.author: Valero Cortés, Mateo
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2016-10-10T08:29:09Z
dc.date.available: 2016-10-10T08:29:09Z
dc.date.issued: 2016
dc.identifier.citation: Hayes, T., Palomar, O., Unsal, O., Cristal, A., Valero, M. Future vector microprocessor extensions for data aggregations. In: Annual International Symposium on Computer Architecture. "43rd International Symposium on Computer Architecture, ISCA 2016: 18-22 June 2016, Seoul, South Korea: proceedings". Seoul: Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 418-430.
dc.identifier.isbn: 978-1-4673-8947-1
dc.identifier.uri: http://hdl.handle.net/2117/90618
dc.description.abstract: As the rate of annual data generation grows exponentially, there is a demand to aggregate and summarise vast amounts of information quickly. In the past, frequency scaling was relied upon to push application throughput. Today, Dennard scaling has ceased and further performance must come from exploiting parallelism. Single instruction-multiple data (SIMD) instruction sets offer a highly efficient and scalable way of exploiting data-level parallelism (DLP). While microprocessors originally offered very simple SIMD support targeted at multimedia applications, these extensions have been growing both in width and functionality. Observing this trend, we use a simulation framework to model future SIMD support and then propose and evaluate five different ways of vectorising data aggregation. We find that although data aggregation is abundant in DLP, it is often too irregular to be expressed efficiently using typical SIMD instructions. Based on this observation, we propose a set of novel algorithms and SIMD instructions to better capture this irregular DLP. Furthermore, we discover that the best algorithm is highly dependent on the characteristics of the input. Our proposed solution can dynamically choose the optimal algorithm in the majority of cases and achieves speedups between 2.7x and 7.6x over a scalar baseline.
dc.description.sponsorship: The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA no 321253 and is supported in part by the European Union (FEDER funds) under contract TIN2015-65316-P. Timothy Hayes is supported by an FPU research grant from the Spanish MECD.
dc.format.extent: 13 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.lcsh: Microprocessors
dc.subject.other: Registers
dc.subject.other: Support vector machines
dc.subject.other: Instruction sets
dc.subject.other: Data models
dc.title: Future vector microprocessor extensions for data aggregations
dc.type: Conference lecture
dc.subject.lemac: Processament en paral·lel (Ordinadors)
dc.subject.lemac: Microprocessadors
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/ISCA.2016.44
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/document/7551411/
dc.rights.access: Open Access
local.identifier.drac: 18819475
dc.description.version: Postprint (published version)
dc.relation.projectid: info:eu-repo/grantAgreement/EC/FP7/321253/EU/Riding on Moore's Law/ROMOL
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
local.citation.author: Hayes, T.; Palomar, O.; Unsal, O.; Cristal, A.; Valero, M.
local.citation.contributor: Annual International Symposium on Computer Architecture
local.citation.pubplace: Seoul
local.citation.publicationName: 43rd International Symposium on Computer Architecture, ISCA 2016: 18-22 June 2016, Seoul, South Korea: proceedings
local.citation.startingPage: 418
local.citation.endingPage: 430


Files in this item


This item appears in the following collections
