A flexible heterogeneous multi-core architecture
Cite as:
hdl:2117/112388
Document type: Conference report
Defense date: 2007
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Abstract
Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a challenge, as application mixes in this environment are nonuniform. Thus, multi-core processors should be flexible enough to provide high throughput for uniform parallel applications as well as high performance for more general workloads. Heterogeneous architectures are a first step in this direction, but partitioning remains static and only roughly fits application requirements. This paper proposes the Flexible Heterogeneous MultiCore processor (FMC), the first dynamic heterogeneous multi-core architecture capable of reconfiguring itself to fit application requirements without programmer intervention. The basic building block of this microarchitecture is a scalable, variable-size window microarchitecture that exploits the concept of Execution Locality to provide large-window capabilities. This makes it possible to overcome the memory wall for applications with high memory-level parallelism (MLP). The microarchitecture contains a set of small and fast cache processors that execute high-locality code and a network of small in-order memory engines that together exploit low-locality code. Single-threaded applications can use the entire network of cores, while multi-threaded applications can efficiently share the resources. The sizing of critical structures remains small enough to handle current power envelopes. In single-threaded mode this processor is able to outperform previous state-of-the-art high-performance processor research by 12% on SpecFP. We show how in a quad-threaded/quad-core environment the processor outperforms a statically allocated configuration in both throughput and harmonic mean, two commonly used metrics to evaluate SMT performance, by around 2-4%. This is achieved while using a very simple sharing algorithm.
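The abstract names throughput and harmonic mean as evaluation metrics without defining them. The sketch below shows one common way such metrics are computed for multi-threaded workloads. It assumes that throughput is the sum of per-thread IPC and that the harmonic mean is taken over each thread's IPC relative to its stand-alone IPC; these definitions, the function names, and the sample values are illustrative assumptions, not taken from the paper.

```python
def throughput(ipc_shared):
    """Aggregate throughput, assumed here to be the sum of per-thread IPC."""
    return sum(ipc_shared)


def harmonic_mean_relative_ipc(ipc_shared, ipc_alone):
    """Harmonic mean of per-thread relative IPC (shared / stand-alone).

    Assumed definition: N / sum(ipc_alone[i] / ipc_shared[i]).
    A thread that is starved (low relative IPC) pulls this value down sharply,
    which is why it is often used as a fairness-aware metric.
    """
    if len(ipc_shared) != len(ipc_alone):
        raise ValueError("both lists need one entry per thread")
    return len(ipc_shared) / sum(a / s for s, a in zip(ipc_shared, ipc_alone))


# Hypothetical IPC values for a four-thread workload (not measured data).
ipc_alone = [2.0, 1.0, 4.0, 2.0]   # each thread running by itself
ipc_shared = [1.0, 0.5, 2.0, 1.0]  # the same threads sharing the processor

total = throughput(ipc_shared)
hmean = harmonic_mean_relative_ipc(ipc_shared, ipc_alone)
print(total, hmean)
```

Under these assumptions, a configuration can raise throughput by favoring one fast thread while lowering the harmonic mean, so reporting both gives a fuller picture than either alone.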
Citation: Pericàs, M., Cristal, A., Cazorla, F., González, R., Jiménez, D. A., Valero, M. A flexible heterogeneous multi-core architecture. In: International Conference on Parallel Architectures and Compilation Techniques. "16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007: 15-19 September 2007, Brasov, Romania". Brasov: Institute of Electrical and Electronics Engineers (IEEE), 2007, p. 13-24.
ISBN: 978-0-7695-2944-8
Publisher version: http://ieeexplore.ieee.org/document/4336196/
Files | Description | Size | Format |
---|---|---|---|
04336196.pdf | | 527.1 KB | PDF |