LRU-PEA: A smart replacement policy for non-uniform cache architectures on chip multiprocessors
Document type: External research report
Rights access: Open Access
The growing speed gap between processor and memory, together with limited memory bandwidth, makes last-level cache performance crucial for chip multiprocessor (CMP) architectures. Non-Uniform Cache Architectures (NUCA) have been introduced to deal with this problem. This memory organization divides the whole memory space into smaller pieces, or banks, so that banks nearer to the requesting core have lower access latencies than banks farther away. Moreover, an adaptive replacement policy that efficiently reduces misses in the last-level cache could boost performance, particularly if set associativity is assumed. Unfortunately, traditional replacement policies do not behave well in this setting because they were designed for single processors. This paper focuses on bank replacement, a policy that involves three key decisions on a miss: where to place the incoming data within the cache set, which data to evict from the cache set, and where to place the evicted data. We propose a novel replacement technique that enables more intelligent replacement decisions, based on the observation that some types of data are accessed less often depending on the bank in which they reside. We call this technique LRU-PEA (Least Recently Used with a Priority Eviction Approach). We show that the proposed technique significantly reduces requests to off-chip memory by increasing the hit ratio in the NUCA cache. This translates into an average IPC improvement of 8% and an Energy per Instruction (EPI) reduction of 5%.
Is part of: UPC-DAC-RR-ARCO-2009-7