Universitat Politècnica de Catalunya


UPCommons. Global access to UPC knowledge


Reducing cache coherence traffic with a NUMA-aware runtime approach

File: Reducing+Cache+Coherence+Traffic+with+a.pdf (1.988 MB)
Cite as: hdl:2117/116365

Caheny, Paul
Álvarez Martí, Lluc
Derradji, Said
Valero Cortés, Mateo
Moreto Planas, Miquel
Casas Guix, Marc
Document type: Article
Defense date: 2018-05
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder.
Project: Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology (EC-H2020-671697)
Abstract
Cache Coherent NUMA (ccNUMA) architectures are widespread because they scale well in core count and memory capacity, and the flat memory address space they offer considerably improves programmability. However, ccNUMA architectures require sophisticated and expensive cache coherence protocols to enforce correctness during parallel execution, and these protocols trigger a significant amount of on- and off-chip traffic. This paper analyses how coherence traffic can best be constrained on a large, real ccNUMA platform comprising 288 cores through a joint hardware/software approach. For several benchmarks, we study coherence traffic in detail under the influence of an added hierarchical cache layer in the directory protocol, combined with runtime-managed NUMA-aware scheduling and data allocation techniques that make the most efficient use of the added hardware. The effectiveness of this joint approach is demonstrated by speedups of 3.14× to 9.97× and coherence traffic reductions of up to 99% in comparison to NUMA-oblivious scheduling and data allocation.
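The contrast the abstract draws between NUMA-aware and NUMA-oblivious scheduling can be illustrated with a toy model. The sketch below is not the paper's runtime: the function names, the task and region layout, and the use of "bytes accessed on a remote node" as a stand-in for coherence traffic are all assumptions made for illustration, and the hierarchical directory cache is not modelled at all.

```python
from collections import Counter

def pick_node(task_regions, region_home):
    """Hypothetical NUMA-aware policy: run the task on the node that is
    home to the largest share of the bytes the task touches."""
    bytes_per_node = Counter()
    for region, size in task_regions.items():
        bytes_per_node[region_home[region]] += size
    # Largest byte count wins; ties go to the lowest node id.
    return min(bytes_per_node, key=lambda n: (-bytes_per_node[n], n))

def remote_bytes(task_regions, region_home, node):
    """Bytes the task must fetch from nodes other than the one it runs on
    (a crude proxy for cross-node traffic, assumed for this sketch)."""
    return sum(size for region, size in task_regions.items()
               if region_home[region] != node)

def total_remote_bytes(tasks, region_home, num_nodes, numa_aware):
    """Sum remote bytes over all tasks under one of two placement policies."""
    total = 0
    for i, regions in enumerate(tasks):
        if numa_aware:
            node = pick_node(regions, region_home)
        else:
            node = i % num_nodes  # NUMA-oblivious: round-robin placement
        total += remote_bytes(regions, region_home, node)
    return total

# Made-up example: four data regions, each allocated on a different node.
region_home = {"A": 0, "B": 1, "C": 2, "D": 3}
# Each task maps the regions it touches to the number of bytes accessed.
tasks = [
    {"B": 100, "A": 10},
    {"C": 100},
    {"D": 100, "C": 20},
    {"A": 100},
]

oblivious = total_remote_bytes(tasks, region_home, 4, numa_aware=False)
aware = total_remote_bytes(tasks, region_home, 4, numa_aware=True)
print("NUMA-oblivious remote bytes:", oblivious)
print("NUMA-aware remote bytes:", aware)
```

In this made-up input, placing each task next to most of its data leaves only the minority regions remote. A real runtime would also have to balance load across nodes, which this sketch ignores.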
Citation: Caheny, P., Alvarez, L., Derradji, S., Valero, M., Moreto, M., Casas, M. Reducing cache coherence traffic with a NUMA-aware runtime approach. "IEEE Transactions on Parallel and Distributed Systems", May 2018, vol. 29, no. 5, p. 1174-1187.
URI: http://hdl.handle.net/2117/116365
DOI: 10.1109/TPDS.2017.2787123
ISSN: 1045-9219
Publisher version: http://ieeexplore.ieee.org/document/8239832/
Collections
  • Computer Sciences - Articles de revista [277]
  • Departament d'Arquitectura de Computadors - Articles de revista [967]
  • CAP - Grup de Computació d'Altes Prestacions - Articles de revista [380]