Predicated execution and register windows for out-of-order processors
Visualitza/Obre
10.5821/dissertation-2117-93337
Inclou dades d'ús des de 2022
Cita com:
hdl:2117/93337
Càtedra / Departament / Institut
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Tipus de documentTesi
Data de defensa2008-11-18
EditorUniversitat Politècnica de Catalunya
Condicions d'accésAccés obert
Tots els drets reservats. Aquesta obra està protegida pels drets de propietat intel·lectual i
industrial corresponents. Sense perjudici de les exempcions legals existents, queda prohibida la seva
reproducció, distribució, comunicació pública o transformació sense l'autorització del titular dels drets
Abstract
ISA extensions are a very powerful approach to implement new hardware techniques that require or benefit from compiler support: decisions made at compile time can be complemented at runtime, achieving a synergistic effect between the compiler and the processor. This thesis is focused on two ISA extensions: predicate execution and register windows. Predicate execution is exploited by the if-conversion compiler technique. If-conversion removes control dependences by transforming them to data dependences, which helps to exploit ILP beyond a single basic-block. Register windows help to reduce the amount of loads and stores required to save and restore registers across procedure calls by storing multiple contexts into a large architectural register file.
In-order processors specially benefit from using both ISA extensions to overcome the limitations that control dependences and memory hierarchy impose on static scheduling. Predicate execution allows to move control dependence instructions past branches. Register windows reduce the amount of memory operations across procedure calls. Although if-conversion and register windows techniques have not been exclusively developed for in-order processors, their use for out-of-order processors has been studied very little. In this thesis we show that the uses of if-conversion and register windows introduce new performance opportunities and new challenges to face in out-of-order processors.
The use of if-conversion in out-of-order processors helps to eliminate hard-to-predict branches, alleviating the severe performance penalties caused by branch mispredictions. However, the removal of some conditional branches by if-conversion may adversely affect the predictability of the remaining branches, because it may reduce the amount of correlation information available to the branch predictor. Moreover, predicate execution in out-of-order processors has to deal with two performance issues. First, multiple definitions of the same logical register can be merged into a single control flow, where each definition is guarded with a different predicate. Second, instructions whose guarding predicate evaluates to false consume unnecessary resources. This thesis proposes a branch prediction scheme based on predicate prediction that solves the three problems mentioned above. This scheme, which is built on top of a predicated ISA that implement a compare-and-branch model such as the one considered in this thesis, has two advantages: First, the branch accuracy is improved because the correlation information is not lost after if-conversion and the mechanism we propose permits using the computed value of the branch predicate when available, achieving 100% of accuracy. Second it avoids the predicate out-of-order execution problems.
Regarding register windows, we propose a mechanism that reduces physical register requirements of an out-of-order processor to the bare minimum with almost no performance loss. The mechanism is based on identifying which architectural registers are in use by current in-flight instructions. The registers which are not in use, i.e. there is no in-flight instruction that references them, can be early released.
In this thesis we propose a very efficient and low-cost hardware implementation of predicate execution and register windows that provide important benefits to out-of-order processors.
In-order processors specially benefit from using both ISA extensions to overcome the limitations that control dependences and memory hierarchy impose on static scheduling. Predicate execution allows to move control dependence instructions past branches. Register windows reduce the amount of memory operations across procedure calls. Although if-conversion and register windows techniques have not been exclusively developed for in-order processors, their use for out-of-order processors has been studied very little. In this thesis we show that the uses of if-conversion and register windows introduce new performance opportunities and new challenges to face in out-of-order processors.
The use of if-conversion in out-of-order processors helps to eliminate hard-to-predict branches, alleviating the severe performance penalties caused by branch mispredictions. However, the removal of some conditional branches by if-conversion may adversely affect the predictability of the remaining branches, because it may reduce the amount of correlation information available to the branch predictor. Moreover, predicate execution in out-of-order processors has to deal with two performance issues. First, multiple definitions of the same logical register can be merged into a single control flow, where each definition is guarded with a different predicate. Second, instructions whose guarding predicate evaluates to false consume unnecessary resources. This thesis proposes a branch prediction scheme based on predicate prediction that solves the three problems mentioned above. This scheme, which is built on top of a predicated ISA that implement a compare-and-branch model such as the one considered in this thesis, has two advantages: First, the branch accuracy is improved because the correlation information is not lost after if-conversion and the mechanism we propose permits using the computed value of the branch predicate when available, achieving 100% of accuracy. Second it avoids the predicate out-of-order execution problems.
Regarding register windows, we propose a mechanism that reduces physical register requirements of an out-of-order processor to the bare minimum with almost no performance loss. The mechanism is based on identifying which architectural registers are in use by current in-flight instructions. The registers which are not in use, i.e. there is no in-flight instruction that references them, can be early released.
In this thesis we propose a very efficient and low-cost hardware implementation of predicate execution and register windows that provide important benefits to out-of-order processors.
CitacióQuiñones Moreno, E. Predicated execution and register windows for out-of-order processors. Tesi doctoral, UPC, Departament d'Arquitectura de Computadors, 2008. ISBN 9788469404348. DOI 10.5821/dissertation-2117-93337. Disponible a: <http://hdl.handle.net/2117/93337>
Dipòsit legalB.11723-2011
ISBN9788469404348
Altres identificadorshttp://www.tdx.cat/TDX-1118110-112648
Col·leccions
Fitxers | Descripció | Mida | Format | Visualitza |
---|---|---|---|---|
TEQM1de1.pdf | 3,839Mb | Visualitza/Obre |