
dc.contributor.author: Villa, Luis
dc.contributor.author: Espasa Sans, Roger
dc.contributor.author: Valero Cortés, Mateo
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2017-10-06T09:01:05Z
dc.date.available: 2017-10-06T09:01:05Z
dc.date.issued: 1997
dc.identifier.citation: Villa, L., Espasa, R., Valero, M. Effective usage of vector registers in advanced vector architectures. In: International Conference on Parallel Architectures and Compilation Techniques. "1997 International Conference on Parallel Architectures and Compilation Techniques: San Francisco, California, November 10-14, 1997: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 1997, p. 250-260.
dc.identifier.isbn: 0-8186-8090-3
dc.identifier.uri: http://hdl.handle.net/2117/108431
dc.description.abstract: This paper presents data confirming that traditional vector architectures cannot reduce their vector register length without suffering a severe performance penalty. However, we show that by combining vector register length reduction with two different ILP techniques, decoupling and multithreading, the performance penalty can be made very small. We show that each resulting architecture tolerates long memory latencies very well and also makes better use of the available storage space in each vector register. Using decoupling and short vectors, each register can be halved while still providing speedups in the range 1.04-1.49 over a traditional architecture with long registers. Using multithreading, we split a vector register file into two halves and show that two independent threads running on such a machine can yield speedups in the range 1.23-1.29. The paper also explores configurations with 1/4 and 1/8 the original vector register size, aimed at cost-conscious designs, and shows that even at 1/4 the original size, the resulting architectures can outperform a traditional machine. We also present results across a wide range of memory latencies and show that the combination of short vectors and ILP techniques results in very good tolerance of slow memory systems.
dc.format.extent: 11 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: Performance evaluation
dc.subject.other: File organisation
dc.title: Effective usage of vector registers in advanced vector architectures
dc.type: Conference report
dc.subject.lemac: Parallel processing (Computers)
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/PACT.1997.644021
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/document/644021/
dc.rights.access: Open Access
local.identifier.drac: 2325399
dc.description.version: Postprint (published version)
local.citation.author: Villa, L.; Espasa, R.; Valero, M.
local.citation.contributor: International Conference on Parallel Architectures and Compilation Techniques
local.citation.publicationName: 1997 International Conference on Parallel Architectures and Compilation Techniques: San Francisco, California, November 10-14, 1997: proceedings
local.citation.startingPage: 250
local.citation.endingPage: 260



All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.