Runtime-assisted cache coherence deactivation in task parallel programs
Cite as:
hdl:2117/125393
Document type: Conference report
Defense date: 2018
Publisher: Association for Computing Machinery (ACM)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder.
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
ROMOL - Riding on Moore's Law (EC-FP7-321253)
Mont-Blanc 3 - Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology (EC-H2020-671697)
Mont-Blanc 2020 - Mont-Blanc 2020, European scalable, modular and power efficient HPC processor (EC-H2020-779877)
Abstract
With increasing core counts, the scalability of directory-based cache coherence has become a challenging problem. To reduce the area and power needs of the directory, recent proposals reduce its size by classifying data as private or shared, and disable coherence for private data. However, existing classification methods suffer from inaccuracies and require complex hardware support with limited scalability.
This paper proposes a hardware/software co-designed approach: the runtime system identifies data that is guaranteed by the programming model semantics to not require coherence and notifies the microarchitecture. The microarchitecture deactivates coherence for this private data and powers off unused directory capacity. Our proposal reduces directory accesses to just 26% of the baseline system, and supports a 64x smaller directory with only 2.8% performance degradation. By dynamically calibrating the directory size, our proposal saves 86% of dynamic energy consumption in the directory without harming performance.
Citation: Caheny, P., Álvarez, L., Valero, M., Moreto, M., Casas, M. Runtime-assisted cache coherence deactivation in task parallel programs. In: International Conference for High Performance Computing, Networking, Storage, and Analysis. "Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis: Dallas, TX, USA, November 11-16, 2018". New York: Association for Computing Machinery (ACM), 2018, p. 1-12.
Publisher version: https://dl.acm.org/citation.cfm?id=3291703
Files | Description | Size | Format | View |
---|---|---|---|---|
Runtime-Assisted Cache Coherence Deactivation.pdf | | 736.4 KB | PDF | View/Open |