Inherently workload-balanced clustered microarchitecture
dc.contributor.author | Abella Ferrer, Jaume |
dc.contributor.author | González Colás, Antonio María |
dc.contributor.other | Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors |
dc.date.accessioned | 2016-11-17T13:15:46Z |
dc.date.available | 2016-11-17T13:15:46Z |
dc.date.issued | 2005 |
dc.identifier.citation | Abella, J., Gonzalez, A. Inherently workload-balanced clustered microarchitecture. In: IEEE International Parallel and Distributed Processing Symposium. "19th IEEE International Parallel and Distributed Processing Symposium: April 4-8, 2005, Denver, Colorado: proceedings". Denver, Colorado: Institute of Electrical and Electronics Engineers (IEEE), 2005, p. 1-10. |
dc.identifier.isbn | 0-7695-2312-9 |
dc.identifier.uri | http://hdl.handle.net/2117/96789 |
dc.description.abstract | The performance of clustered microarchitectures relies on steering schemes that try to find the best trade-off between workload balance and inter-cluster communication penalties. In previously proposed clustered processors, reducing communication penalties and balancing the workload are opposite targets, since improving one usually implies a detriment in the other. In this paper we propose a new clustered microarchitecture that can minimize communication penalties without compromising workload balance. The key idea is to arrange the clusters in a ring topology in such a way that results of one cluster can be forwarded to the neighbor cluster with a very short latency. In this way, minimizing communication penalties is favored when the producer of a value and its consumer are placed in adjacent clusters, which also favors workload balance. The proposed microarchitecture is shown to outperform a state-of-the-art clustered processor. For instance, for an 8-cluster configuration and just one fully pipelined unidirectional bus, 15% speedup is achieved on average for FP programs. |
dc.format.extent | 10 p. |
dc.language.iso | eng |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) |
dc.subject | UPC subject areas::Computer science::Computer architecture |
dc.subject.lcsh | Microprocessors |
dc.subject.other | Microarchitecture |
dc.subject.other | Wire |
dc.subject.other | Computer architecture |
dc.subject.other | Topology |
dc.subject.other | Clocks |
dc.subject.other | Microprocessors |
dc.subject.other | Pipelines |
dc.subject.other | Process design |
dc.subject.other | Delay effects |
dc.subject.other | Energy consumption |
dc.title | Inherently workload-balanced clustered microarchitecture |
dc.type | Conference report |
dc.subject.lemac | Microprocessors |
dc.contributor.group | Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors |
dc.identifier.doi | 10.1109/IPDPS.2005.258 |
dc.description.peerreviewed | Peer Reviewed |
dc.relation.publisherversion | http://ieeexplore.ieee.org/document/1419837/ |
dc.rights.access | Open Access |
local.identifier.drac | 2358481 |
dc.description.version | Postprint (published version) |
local.citation.author | Abella, J.; Gonzalez, A. |
local.citation.contributor | IEEE International Parallel and Distributed Processing Symposium |
local.citation.pubplace | Denver, Colorado |
local.citation.publicationName | 19th IEEE International Parallel and Distributed Processing Symposium: April 4-8, 2005, Denver, Colorado: proceedings |
local.citation.startingPage | 1 |
local.citation.endingPage | 10 |