Several multithreading techniques have been proposed to reduce resource underutilization in Very Long Instruction
Word (VLIW) processors. Simultaneous MultiThreading (SMT) is a popular technique that improves processor performance by issuing multiple instructions from different
threads. SMT requires extra hardware to merge instructions from different threads. The complexity of this hardware increases
substantially with the number of threads, limiting the number of threads that can realistically be supported to only 2. Cluster-level Simultaneous MultiThreading (CSMT)
is a technique that merges instructions from threads at the cluster level. CSMT has a much lower merging hardware cost and can support a larger number of threads. However,
CSMT performance is lower than that of SMT. In this paper, we evaluate several hardware designs that support a larger number of threads by using a merging scheme that combines SMT and CSMT merging. For instance, one of the evaluated schemes merges the first 2 threads using SMT and then merges the result with 2 other threads using CSMT. This scheme achieves performance similar to that of 4-thread SMT while maintaining a reasonable merging hardware cost.
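
As an illustration only, the combined scheme described above can be sketched as follows. The model is an assumption and is not taken from the paper: each VLIW instruction is represented as a list giving the number of operations it places in each cluster, SMT merging is taken to succeed whenever the combined operations fit within a per-cluster issue width, and CSMT merging is taken to succeed only when the two instructions use disjoint clusters. The names `smt_merge`, `csmt_merge`, `hybrid_merge`, and `ISSUE_WIDTH` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed model: an instruction is a list of operation counts, one per cluster.
Instr = List[int]

ISSUE_WIDTH = 4  # hypothetical per-cluster issue width


def smt_merge(a: Instr, b: Instr, width: int = ISSUE_WIDTH) -> Optional[Instr]:
    """Operation-level merge (assumed SMT model): succeeds if every
    cluster can hold the combined operations of both instructions."""
    merged = [x + y for x, y in zip(a, b)]
    return merged if all(n <= width for n in merged) else None


def csmt_merge(a: Instr, b: Instr) -> Optional[Instr]:
    """Cluster-level merge (assumed CSMT model): succeeds only if the
    two instructions never use the same cluster."""
    if any(x > 0 and y > 0 for x, y in zip(a, b)):
        return None
    return [x + y for x, y in zip(a, b)]


def hybrid_merge(threads: List[Instr],
                 width: int = ISSUE_WIDTH) -> Tuple[Instr, List[int]]:
    """Merge threads 0 and 1 with SMT, then try to add each remaining
    thread with CSMT. Returns the merged bundle and the issued thread ids."""
    bundle, issued = threads[0], [0]
    if len(threads) > 1:
        m = smt_merge(bundle, threads[1], width)
        if m is not None:
            bundle, issued = m, [0, 1]
    for i in range(2, len(threads)):
        m = csmt_merge(bundle, threads[i])
        if m is not None:
            bundle = m
            issued.append(i)
    return bundle, issued


if __name__ == "__main__":
    # Four threads on a 4-cluster machine (made-up operation counts).
    threads = [[2, 0, 0, 0], [1, 1, 0, 0], [0, 0, 3, 0], [1, 0, 0, 0]]
    print(hybrid_merge(threads))
```

In this toy model the cluster-level check is a simple per-cluster conflict test, while the operation-level check has to reason about slot occupancy inside each cluster. That difference is one way to picture why the abstract describes the CSMT merging hardware as much cheaper.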
Citation: Gupta, M.; Sánchez, F.; Llosa, J. Thread merging schemes for multithreaded clustered VLIW processors. In: International Conference on Parallel Processing. "38th International Conference on Parallel Processing". Vienna: 2009, p. 445-452.