Runtime-guided management of scratchpad memories in multicore architectures

Document type: Conference report
Defense date: 2015
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Abstract
The increasing number of cores and the anticipated level of heterogeneity in upcoming multicore architectures cause important problems in traditional cache hierarchies. A good way to alleviate these problems is to add scratchpad memories alongside the cache hierarchy, forming a hybrid memory hierarchy. This memory organization has the potential to improve performance and to reduce power consumption and on-chip network traffic, but exposing such a complex memory model to the programmer has a very negative impact on the programmability of the architecture. Emerging task-based programming models are a promising alternative for programming heterogeneous multicore architectures. In these models the runtime system manages the execution of the tasks on the architecture, which allows it to apply many optimizations in a generic way at the runtime system level. This paper proposes giving the runtime system the responsibility for managing the scratchpad memories of a hybrid memory hierarchy in multicore processors, transparently to the programmer. In the envisioned system, the runtime system takes advantage of the information found in the task dependences to map the inputs and outputs of a task to the scratchpad memory of the core that is going to execute it. In addition, the paper exploits two mechanisms to overlap the data transfers with computation and a locality-aware scheduler to reduce data motion. In a 32-core multicore architecture, the hybrid memory hierarchy outperforms cache-only hierarchies by up to 16%, reduces on-chip network traffic by up to 31% and saves up to 22% of the consumed power.
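The mapping idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the names `Dep`, `Core`, `pick_core` and `map_task`, the eviction rule, and the tie-breaking are all assumptions made for illustration, and the overlap of data transfers with computation is not modeled. The sketch only shows how task dependence information (name, size, direction) could drive a locality-aware core choice and a per-dependence decision between scratchpad and cache.

```python
from dataclasses import dataclass, field


@dataclass
class Dep:
    """One task dependence (hypothetical representation)."""
    name: str   # identifier of the data region
    size: int   # size in bytes
    mode: str   # "in", "out" or "inout"


@dataclass
class Core:
    """A core with a private scratchpad memory (SPM)."""
    cid: int
    spm_capacity: int
    resident: dict = field(default_factory=dict)  # region name -> bytes held in the SPM

    def free(self):
        return self.spm_capacity - sum(self.resident.values())


def pick_core(cores, deps):
    """Locality-aware choice: prefer the core whose SPM already holds
    the most input bytes of the task; ties go to the lowest core id."""
    def reuse(core):
        return sum(d.size for d in deps
                   if d.mode in ("in", "inout") and d.name in core.resident)
    return max(cores, key=lambda c: (reuse(c), -c.cid))


def map_task(core, deps):
    """Decide, per dependence, whether it is served from the SPM or the cache.
    Returns a list of (region name, action) pairs."""
    plan = []
    needed = {d.name for d in deps}
    for d in deps:
        if d.name in core.resident:
            plan.append((d.name, "reuse"))       # already in this core's SPM
            continue
        # Regions not used by this task may be evicted (assumed policy).
        victims = [n for n in core.resident if n not in needed]
        evictable = sum(core.resident[n] for n in victims)
        if core.free() + evictable < d.size:
            plan.append((d.name, "cache"))       # does not fit: fall back to the caches
            continue
        while core.free() < d.size:
            del core.resident[victims.pop(0)]    # evict oldest non-needed region
        core.resident[d.name] = d.size
        # Inputs must be copied in; pure outputs only need space reserved.
        plan.append((d.name, "allocate" if d.mode == "out" else "copy_in"))
    return plan
```

A runtime built along these lines would call `pick_core` when a task becomes ready and `map_task` just before the task runs, then issue the transfers listed in the returned plan.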
Description
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Citation: Álvarez, Ll., Moreto, M., Casas, M., Castillo, E., Martorell, X., Labarta, J., Ayguadé, E., Valero, M. Runtime-guided management of scratchpad memories in multicore architectures. A: International Conference on Parallel Architectures and Compilation Techniques. "24th International Conference on Parallel Architecture and Compilation: 18–21 October 2015, San Francisco: proceedings". San Francisco, CA: Institute of Electrical and Electronics Engineers (IEEE), 2015, p. 379-391.
ISBN: 978-1-4673-9524-3
Publisher version: http://dx.doi.org/10.1109/PACT.2015.26
Files | Size |
---|---|
lluca-pact2015-open.pdf | 4,194Mb |