Dynamic-vector execution on a general purpose EDGE chip multiprocessor

Duric, Milovan; Palomar Pérez, Óscar; Smith, Aaron; Stanic, Milan; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo; Burger, Doug; Veidenbaum, Alexander V

doi:10.1109/SAMOS.2014.6893190

dc.contributor.author	Duric, Milovan
dc.contributor.author	Palomar Pérez, Óscar
dc.contributor.author	Smith, Aaron
dc.contributor.author	Stanic, Milan
dc.contributor.author	Unsal, Osman Sabri
dc.contributor.author	Cristal Kestelman, Adrián
dc.contributor.author	Valero Cortés, Mateo
dc.contributor.author	Burger, Doug
dc.contributor.author	Veidenbaum, Alexander V
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.contributor.other	Barcelona Supercomputing Center
dc.date.accessioned	2015-05-06T12:37:41Z
dc.date.created	2014
dc.date.issued	2014
dc.identifier.citation	Duric, M. [et al.]. Dynamic-vector execution on a general purpose EDGE chip multiprocessor. In: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation. "International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS XIV): proceedings: July 14-17, 2014: Samos, Greece". Samos: Institute of Electrical and Electronics Engineers (IEEE), 2014, p. 18-25.
dc.identifier.isbn	978-1-4799-3770-7
dc.identifier.uri	http://hdl.handle.net/2117/27788
dc.description.abstract	This paper proposes a cost-effective technique that morphs the available cores of a low power chip multiprocessor (CMP) into an accelerator for data parallel (DLP) workloads. Instead of adding a special-purpose vector architecture as an accelerator, our technique leverages the resources of each CMP core to mimic the functionality of a vector processor. The morphing provides dynamic vector execution (DVX) on a general purpose CMP, by adding minimal hardware for vector control. DVX enhances the vector execution by dynamically configuring the allocation of compute and memory resources to match particular workload requirements. As an energy efficient substrate, we utilize modest dual issue cores based on an Explicit Data Graph Execution (EDGE) architecture. The results show that a DVX enabled 4-core EDGE CMP improves the energy-delay product over 14x, at the cost of only 1.1% of additional area. We compare DVX against a CMP that adds a dedicated DLP accelerator based on a conventional high performance vector design. The vector accelerator increases the area footprint over 74%, which greatly affects the cost of the modest processor. DVX avoids the additional costs and yet gains over 86% of the speedup obtained with the dedicated accelerator.
dc.description.sponsorship	This work has been partially funded by the Spanish Government (TIN2012-34557), the European Research Council under the European Union's 7th FP (FP/2007-2013) / ERC GA n. 321253, and Microsoft Research.
dc.format.extent	8 p.
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.subject	UPC subject areas::Electronic engineering::Microelectronics
dc.subject	UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh	Microprocessors
dc.subject.lcsh	Computer architecture
dc.subject.other	DLP accelerator
dc.subject.other	DVX enabled 4-core EDGE CMP
dc.subject.other	EDGE architecture
dc.subject.other	Cost-effective technique
dc.subject.other	Data parallel workloads
dc.subject.other	Dedicated accelerator
dc.subject.other	Dynamic vector execution
dc.subject.other	Energy efficient substrate
dc.subject.other	Energy-delay product
dc.subject.other	Explicit data graph execution
dc.subject.other	Functionality
dc.subject.other	General purpose CMP
dc.subject.other	General purpose EDGE chip
dc.subject.other	Multiprocessor
dc.subject.other	High performance vector design
dc.subject.other	Low power chip multiprocessor
dc.subject.other	Minimal hardware
dc.subject.other	Modest processor
dc.subject.other	Special-purpose vector architecture
dc.subject.other	Vector accelerator
dc.subject.other	Vector control
dc.subject.other	Vector processor
dc.subject.other	Microprocessor chips
dc.subject.other	Multiprocessing systems
dc.subject.other	Computational modeling
dc.subject.other	Computer architecture
dc.subject.other	Hardware
dc.subject.other	Instruction sets
dc.subject.other	Message systems
dc.subject.other	Registers
dc.subject.other	Vectors
dc.title	Dynamic-vector execution on a general purpose EDGE chip multiprocessor
dc.type	Conference report
dc.subject.lemac	Microprocessadors
dc.subject.lemac	Arquitectura d'ordinadors
dc.contributor.group	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi	10.1109/SAMOS.2014.6893190
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6893190
dc.rights.access	Restricted access - publisher's policy
local.identifier.drac	15430860
dc.description.version	Postprint (published version)
dc.relation.projectid	info:eu-repo/grantAgreement/EC/FP7/321253/EU/Riding on Moore's Law/ROMOL
dc.date.lift	10000-01-01
local.citation.author	Duric, M.; Palomar, O.; Smith, A.; Stanic, M.; Unsal, O.; Cristal, A.; Valero, M.; Burger, D.; Veidenbaum, A.V.
local.citation.contributor	International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation
local.citation.pubplace	Samos
local.citation.publicationName	International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS XIV): proceedings: July 14-17, 2014: Samos, Greece
local.citation.startingPage	18
local.citation.endingPage	25

Files in this item

Name: Dynamic-vector execution on a ...
Size: 1.872 MB
Format: PDF
Description: Dynamic-vector execution on a ...

This item appears in the following collections

Conference papers/presentations [574]
Conference papers/presentations [784]
Conference papers/presentations [1,954]

UPCommons. Open knowledge portal of the UPC