Reusing cached schedules in an out-of-order processor with in-order issue logic

Palomar Pérez, Óscar

doi:10.5821/dissertation-2117-94564

dc.contributor	Hormigo, Antonio Juan
dc.contributor	Navarro, Juan J. (Juan José)
dc.contributor.author	Palomar Pérez, Óscar
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned	2012-04-25T08:55:32Z
dc.date.available	2012-04-25T08:55:32Z
dc.date.issued	2011-05-09
dc.identifier.citation	Palomar Pérez, Ó. Reusing cached schedules in an out-of-order processor with in-order issue logic. Tesi doctoral, UPC, Departament d'Arquitectura de Computadors, 2011. DOI 10.5821/dissertation-2117-94564.
dc.identifier.uri	http://hdl.handle.net/2117/94564
dc.description.abstract	Modern processors use out-of-order processing logic to achieve high performance in Instructions Per Cycle (IPC) but this logic has a serious impact on the achievable frequency. In order to get better performance out of smaller transistors there is a trend to increase the number of cores per die instead of making the cores themselves bigger. Moreover, for throughput-oriented and server workloads, simpler in-order processors that allow more cores per die and higher design frequencies are becoming the preferred choice. Unfortunately, for other workloads this type of cores result in a lower single thread performance. There are many workloads where it is still important to achieve good single thread performance. In this thesis we present the ReLaSch processor. Its aim is to enable high IPC cores capable of running at high clock frequencies by processing the instructions using simple superscalar in-order issue logic and caching instruction groups that are dynamically scheduled in hardware after commit, that is, out of the critical path and only when really needed. Objective This thesis has several research goals: • Show that the dynamic scheduler of a conventional out-of-order processor does a lot of redundant work because it ignores the repetitiveness of code. • Propose a complete superscalar out-of-order architecture that reduces the amount of redundant work done by creating the schedules once in dedicated hardware, storing them in a cache of schedules and reusing the schedules as much as possible. • Place the scheduler out of the critical path of execution, which should be enabled by the reduction of work that it must do. Thus, the execution path of our proposed processor can be simpler than that of a conventional out-of-order processor. Proposal and results We present the \textbf{ReLaSch} processor, named after Reused Late Schedules, in which the creation of issue-groups is removed from the critical path of execution and uses a simple and small in-order issue logic. It just wakes-up and selects the instructions of a single issue-group each cycle, instead of processing the instructions of a whole issue queue. A new logic at the end of the conventional pipeline schedules the committed instructions. The new scheduler can be complex since it is not in the critical path of execution. The schedules are cached and whenever it is possible an rgroup is read and its instructions executed. The schedules are reused, lowering the pressure on the scheduling logic. In some cases, the ReLaSch processor is able to outperform a conventional out-of-order processor, because the post-commit scheduler has a broader vision of the code. For instance, while ReLaSch can schedule together two independent instructions that are distant in the code, a conventional out-oforder processor only issues them in the same cycle if both are in-flight. The ReLaSch processor predicts the branch targets, memory aliases and latencies at scheduling time, out of the critical path. The prediction is based on the most recent executions at scheduling time. Furthermore, most of the register renaming process is performed by the scheduler and is removed from the execution pipeline. Our experiments show that ReLaSch has the same average IPC as our reference out-of-order processor and is clearly better than the reference inorder processor (1.55 speed-up). In all cases it outperforms the in-order processor and in 23 benchmarks out of 40 it has a higher IPC than the reference out-of-order processor.
dc.format.extent	204 p.
dc.language.iso	eng
dc.publisher	Universitat Politècnica de Catalunya
dc.rights	ADVERTIMENT. L'accés als continguts d'aquesta tesi doctoral i la seva utilització ha de respectar els drets de la persona autora. Pot ser utilitzada per a consulta o estudi personal, així com en activitats o materials d'investigació i docència en els termes establerts a l'art. 32 del Text Refós de la Llei de Propietat Intel·lectual (RDL 1/1996). Per altres utilitzacions es requereix l'autorització prèvia i expressa de la persona autora. En qualsevol cas, en la utilització dels seus continguts caldrà indicar de forma clara el nom i cognoms de la persona autora i el títol de la tesi doctoral. No s'autoritza la seva reproducció o altres formes d'explotació efectuades amb finalitats de lucre ni la seva comunicació pública des d'un lloc aliè al servei TDX. Tampoc s'autoritza la presentació del seu contingut en una finestra o marc aliè a TDX (framing). Aquesta reserva de drets afecta tant als continguts de la tesi com als seus resums i índexs.
dc.source	TDX (Tesis Doctorals en Xarxa)
dc.subject	Àrees temàtiques de la UPC::Informàtica
dc.subject.other	Issue logic
dc.subject.other	In-order processor
dc.subject.other	Out-of-order processor
dc.title	Reusing cached schedules in an out-of-order processor with in-order issue logic
dc.type	Doctoral thesis
dc.subject.lemac	Tractament del senyal -- Tècniques digitals
dc.subject.lemac	Programació lògica
dc.identifier.doi	10.5821/dissertation-2117-94564
dc.identifier.dl	B. 17062-2012
dc.rights.access	Open Access
dc.description.version	Postprint (published version)
dc.identifier.tdx	http://hdl.handle.net/10803/80536

Fitxers d'aquest items

Nom:: TOPP1de1.pdf
Mida:: 1,759Mb
Format:: PDF

Visualitza/Obre

Aquest ítem apareix a les col·leccions següents

Departament d'Arquitectura de Computadors [361]
Totes les tesis [5.459]

Mostra el registre d'ítem simple

UPCommons. Portal del coneixement obert de la UPC

Reusing cached schedules in an out-of-order processor with in-order issue logic

Fitxers d'aquest items

Aquest ítem apareix a les col·leccions següents

Explora