dc.contributor.author: González Colás, Antonio María
dc.contributor.author: Tubella Murgadas, Jordi
dc.contributor.author: Molina, Carlos
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2017-06-09T09:54:02Z
dc.date.available: 2017-06-09T09:54:02Z
dc.date.issued: 1999
dc.identifier.citation: González, A., Tubella, J., Molina, C. Trace-level reuse. A: International Conference on Parallel Processing. "1999 International Conference on Parallel Processing: 21-24 September 1999, Aizu-Wakamatsu City, Japan: proceedings". Aizu-Wakamatsu: Institute of Electrical and Electronics Engineers (IEEE), 1999, p. 30-37.
dc.identifier.isbn: 0-7695-0350-0
dc.identifier.uri: http://hdl.handle.net/2117/105273
dc.description.abstract: Trace-level reuse is based on the observation that some traces (dynamic sequences of instructions) are frequently repeated during the execution of a program, and in many cases, the instructions that make up such traces have the same source operand values. The execution of such traces will obviously produce the same outcome and thus, their execution can be skipped if the processor records the outcome of previous executions. This paper presents an analysis of the performance potential of trace-level reuse and discusses a preliminary realistic implementation. Like instruction-level reuse, trace-level reuse can improve performance by decreasing resource contention and the latency of some instructions. However, we show that trace-level reuse is more effective than instruction-level reuse because the former can avoid fetching the instructions of reused traces. This has two important benefits: it reduces the fetch bandwidth requirements, and it increases the effective instruction window size since these instructions do not occupy window entries. Moreover, trace-level reuse can compute all at once the result of a chain of dependent instructions, which may allow the processor to avoid the serialization caused by data dependences and thus, to potentially exceed the dataflow limit.
dc.format.extent: 8 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: UPC subject areas::Computer science::Computer architecture
dc.subject.lcsh: Microprocessors
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: Performance evaluation
dc.subject.other: Multiprocessing systems
dc.subject.other: Instruction sets
dc.subject.other: Resource allocation
dc.title: Trace-level reuse
dc.type: Conference report
dc.subject.lemac: Microprocessadors
dc.subject.lemac: Processament en paral·lel (Ordinadors)
dc.contributor.group: Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
dc.identifier.doi: 10.1109/ICPP.1999.797385
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: http://ieeexplore.ieee.org/document/797385/
dc.rights.access: Open Access
local.identifier.drac: 2394691
dc.description.version: Postprint (published version)
local.citation.author: González, A.; Tubella, J.; Molina, C.
local.citation.contributor: International Conference on Parallel Processing
local.citation.pubplace: Aizu-Wakamatsu
local.citation.publicationName: 1999 International Conference on Parallel Processing: 21-24 September 1999, Aizu-Wakamatsu City, Japan: proceedings
local.citation.startingPage: 30
local.citation.endingPage: 37
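
The reuse idea summarized in the abstract can be illustrated with a small software analogy. The sketch below is not the paper's hardware design. It assumes a hypothetical memo table keyed by a trace's starting address and the values of its input (live-in) registers; the names `TraceOutcome`, `TraceReuseTable`, `run_trace`, and `execute`, the register names, and the addresses 100 and 112 are all invented for illustration. On a hit, the recorded outputs are applied at once and the trace body is skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceOutcome:
    live_outs: tuple  # ((register, value), ...) written by the trace
    next_pc: int      # address of the first instruction after the trace


class TraceReuseTable:
    """Hypothetical memo table: (start PC, live-in values) -> recorded outcome."""

    def __init__(self):
        self._entries = {}

    def record(self, start_pc, live_ins, outcome):
        key = (start_pc, tuple(sorted(live_ins.items())))
        self._entries[key] = outcome

    def lookup(self, start_pc, regs):
        # A trace is reusable only if every live-in register currently
        # holds the value it held when the outcome was recorded.
        for (pc, live_ins), outcome in self._entries.items():
            if pc == start_pc and all(regs.get(r) == v for r, v in live_ins):
                return outcome
        return None


def run_trace(regs):
    # Stand-in for a chain of three dependent instructions (assumed example).
    regs["r3"] = regs["r1"] + regs["r2"]
    regs["r4"] = regs["r3"] * 2
    regs["r5"] = regs["r4"] - regs["r1"]
    return 112  # assumed address following the trace


def execute(table, regs, pc=100):
    """Return (next_pc, reused). On a hit the trace body is never run."""
    hit = table.lookup(pc, regs)
    if hit is not None:
        regs.update(dict(hit.live_outs))  # apply all results at once
        return hit.next_pc, True
    live_ins = {"r1": regs["r1"], "r2": regs["r2"]}
    next_pc = run_trace(regs)
    live_outs = tuple((r, regs[r]) for r in ("r3", "r4", "r5"))
    table.record(pc, live_ins, TraceOutcome(live_outs, next_pc))
    return next_pc, False
```

In this sketch, calling `execute` twice with the same `r1` and `r2` values runs `run_trace` only the first time; the second call copies the recorded `r3`, `r4`, and `r5` in a single step, which loosely mirrors how a reused chain of dependent instructions avoids being serialized.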

