Filtering directory lookups in CMPs
dc.contributor.author | Bosque, Ana |
dc.contributor.author | Viñals Yufera, Víctor |
dc.contributor.author | Ibáñez, Pablo |
dc.contributor.author | Llaberia Griñó, José M. |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors |
dc.date.accessioned | 2011-07-18T10:36:17Z |
dc.date.available | 2011-07-18T10:36:17Z |
dc.date.created | 2010 |
dc.date.issued | 2010 |
dc.identifier.citation | Bosque, A. [et al.]. Filtering directory lookups in CMPs. A: Euromicro Conference on Digital System Design: Architectures, Methods and Tools. "13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools". Lille: 2010, p. 207-216. |
dc.identifier.uri | http://hdl.handle.net/2117/12997 |
dc.description.abstract | Coherence protocols consume a significant fraction of power to determine which coherence action should take place. In this paper we focus on CMPs with a shared cache and a directory-based coherence protocol implemented as a duplicate of the local cache tags. We observe that a large fraction of directory lookups miss because the block looked up is not cached in any local cache. We propose adding a filter before the directory lookup to reduce the number of lookups to this structure. The filter identifies whether the current block was last accessed as data or as an instruction. With this information, looking up the whole directory can be avoided for most accesses. We evaluate the filter in a CMP with 8 in-order processors with 4 threads each and a memory hierarchy with a shared L2 cache. We show that a filter with a size of 3% of the tag array of the shared cache can avoid more than 70% of all comparisons performed by directory lookups, with a performance loss of just 0.2% for SPLASH2 and 1.5% for Specweb2005. On average, the number of 15-bit comparisons avoided per cycle is 54 out of 77 for SPLASH2 and 29 out of 41 for Specweb2005. In both cases, the filter requires less than one read of 1 bit per cycle. |
dc.format.extent | 10 p. |
dc.language.iso | eng |
dc.subject | UPC subject areas::Computer science::Computer architecture |
dc.subject.lcsh | Multiprocessors |
dc.subject.other | Cache storage |
dc.subject.other | Multiprocessing systems |
dc.subject.other | Protocols |
dc.title | Filtering directory lookups in CMPs |
dc.type | Conference report |
dc.subject.lemac | Multiprocessadors |
dc.contributor.group | Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions |
dc.identifier.doi | 10.1109/DSD.2010.85 |
dc.rights.access | Open Access |
local.identifier.drac | 2867203 |
dc.description.version | Postprint (published version) |
dc.relation.projectid | info:eu-repo/grantAgreement/EC/FP7/217068/EU/High Performance and Embedded Architecture and Compilation/HIPEAC |
local.citation.author | Bosque, A.; Viñals, V.; Ibáñez, P.; Llaberia, J. |
local.citation.contributor | Euromicro Conference on Digital System Design: Architectures, Methods and Tools |
local.citation.pubplace | Lille |
local.citation.publicationName | 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools |
local.citation.startingPage | 207 |
local.citation.endingPage | 216 |
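
The abstract reports both a headline figure (more than 70% of directory comparisons avoided) and per-cycle averages (54 of 77 for SPLASH2, 29 of 41 for Specweb2005). As a quick consistency check, the short sketch below recomputes the avoided fraction from those averages. The names `reported` and `avoided_fraction` are illustrative only and do not come from the paper.

```python
# Average 15-bit tag comparisons per cycle, as reported in the abstract:
# (comparisons avoided, total comparisons) for each benchmark suite.
reported = {
    "SPLASH2": (54, 77),
    "Specweb2005": (29, 41),
}


def avoided_fraction(avoided: int, total: int) -> float:
    """Return the fraction of directory comparisons the filter avoids."""
    return avoided / total


fractions = {name: avoided_fraction(a, t) for name, (a, t) in reported.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of comparisons avoided")
```

Both ratios come out slightly above 0.70, which agrees with the "more than 70%" claim in the abstract.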