dc.contributor.author: López Álvarez, David
dc.contributor.author: Llosa Espuny, José Francisco
dc.contributor.author: Valero Cortés, Mateo
dc.contributor.author: Ayguadé Parra, Eduard
dc.contributor.other: Universitat Politècnica de Catalunya. Institut de Ciències de l'Educació
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2016-04-11T13:28:21Z
dc.date.available: 2016-04-11T13:28:21Z
dc.date.issued: 2001-10
dc.identifier.citation: López, D., Llosa, J., Valero, M., Ayguadé, E. Cost-conscious strategies to increase performance of numerical programs on aggressive VLIW architectures. "IEEE transactions on computers", October 2001, vol. 50, no. 10, p. 1033-1051.
dc.identifier.issn: 0018-9340
dc.identifier.uri: http://hdl.handle.net/2117/85498
dc.description.abstract: Loops are the main time-consuming part of numerical applications. The performance of the loops is limited either by the resources offered by the architecture or by recurrences in the computation. To execute more operations per cycle, current processors are designed with growing degrees of resource replication (replication technique) for memory ports and functional units. However, the high cost in terms of area and cycle time of this technique precludes the use of high degrees of replication. High values for the cycle time may clearly offset any gain in terms of number of execution cycles. High values for the area may lead to an unimplementable configuration. An alternative to resource replication is resource widening (widening technique), which has also been used in some recent designs in which the width of the resources is increased (i.e., a single operation is performed over multiple data). Moreover, several general-purpose superscalar microprocessors have been implemented with multiply-add fused floating-point units (fusion technique), which reduces the latency of the combined operation and the number of resources used. The authors evaluate a broad set of VLIW processor design alternatives that combine the three techniques. We perform a technological projection for the next processor generations in order to foresee the possible implementable alternatives. From this study, we conclude that if the cost is taken into account, combining certain degrees of replication and widening in the hardware resources is more effective than applying only replication. Also, we confirm that multiply-add fused units will have a significant impact in raising the performance of future processor architectures with a reasonable increase in cost.
dc.format.extent: 19 p.
dc.language.iso: eng
dc.subject: UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh: Microprocessors
dc.subject.other: Floating point arithmetic
dc.subject.other: Instruction sets
dc.subject.other: Multiprocessing systems
dc.subject.other: Parallel architectures
dc.subject.other: Parallel programming
dc.subject.other: Pipeline processing
dc.subject.other: Program control structures
dc.title: Cost-conscious strategies to increase performance of numerical programs on aggressive VLIW architectures
dc.type: Article
dc.subject.lemac: Microprocessadors
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/12.956090
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=956090
dc.rights.access: Open Access
local.identifier.drac: 654435
dc.description.version: Postprint (published version)
local.citation.author: López, D.; Llosa, J.; Valero, M.; Ayguadé, E.
local.citation.publicationName: IEEE transactions on computers
local.citation.volume: 50
local.citation.number: 10
local.citation.startingPage: 1033
local.citation.endingPage: 1051

