Boosting single-thread performance in multi-core systems through fine-grain multi-threading
Document type: Conference report
Publisher: ACM Press. Association for Computing Machinery
Rights access: Restricted access - publisher's policy
Industry has shifted towards multi-core designs as we have hit the memory and power walls. However, single-thread performance remains of paramount importance, since some applications have limited thread-level parallelism (TLP), and even a small portion of code with limited TLP imposes significant constraints on overall performance, as explained by Amdahl’s law. In this paper we propose a novel approach for leveraging multiple cores to improve single-thread performance in a multi-core design. The proposed technique features a set of novel hardware mechanisms that support the execution of threads generated at compile time. These threads result from a fine-grain speculative decomposition of the original application, and they are executed on a modified multi-core system that includes: (1) mechanisms to support multiple versions; (2) mechanisms to detect violations among threads; (3) mechanisms to reconstruct the original sequential order; and (4) mechanisms to checkpoint the architectural state and recover from misspeculations. On SPEC2006, the proposed scheme outperforms previous hardware-only schemes for combining cores to execute single-thread applications by more than 10% on average across all configurations. Moreover, single-thread performance improves by 41% on average when the proposed scheme is used on a Tiny Core, and by up to 2.6x for some selected applications.
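The Amdahl's law constraint mentioned in the abstract can be illustrated with a minimal sketch. The function name `amdahl_speedup` and the 90% parallel fraction are assumptions chosen for illustration; they are not taken from the paper, and the code does not model the proposed hardware.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Overall speedup when only `parallel_fraction` of the run time scales with `cores`.

    Hypothetical helper for illustration; it is not part of the paper.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at single-thread speed; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload: 90% of the run time parallelizes perfectly.
    for cores in (2, 4, 16, 64):
        print(cores, round(amdahl_speedup(0.9, cores), 2))
```

Under this assumed workload, 16 cores give a speedup of 6.4x, and no number of cores can exceed 10x, because the 10% serial part still runs at single-thread speed. This is why speeding up a single thread matters even on a multi-core chip.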
Citation: Madriles, C. [et al.]. Boosting single-thread performance in multi-core systems through fine-grain multi-threading. In: International Symposium on Computer Architecture. "International Symposium on Computer Architecture". Austin, TX: ACM Press. Association for Computing Machinery, 2009, p. 474-483.