dc.contributor.author: Gupta, Manoj
dc.contributor.author: Llosa Espuny, José Francisco
dc.contributor.author: Sánchez Carracedo, Fermín
dc.contributor.other: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned: 2010-02-25T11:36:55Z
dc.date.available: 2010-02-25T11:36:55Z
dc.date.created: 2007-07
dc.date.issued: 2007-07
dc.identifier.citation: Gupta, M.; Llosa, J.; Sánchez, F. Performance evaluation of cluster-level SMT VLIW processors. A: Advanced Computer Architecture and Compilation for Embedded Systems. "ACACES 2007: poster abstracts: July 18, 2007, L'Aquila, Italy". L'Aquila: 2007, p. 185-188.
dc.identifier.uri: http://hdl.handle.net/2117/6468
dc.description.abstract: Clustered VLIW embedded processors have become widespread because of their simple hardware and low power consumption. However, while some applications exhibit large amounts of instruction-level parallelism (ILP) and benefit from very wide machines, others have little ILP, which wastes precious resources in wide processors. Simultaneous MultiThreading (SMT) is a well-known technique that improves resource utilization by exploiting thread-level parallelism at the instruction grain level. However, implementing SMT for VLIWs requires complex structures. CSMT (Cluster-level Simultaneous MultiThreading) allows some degree of SMT in clustered VLIW processors. CSMT takes the set of operations that execute simultaneously in a given cluster (called a bundle) as the assignment unit. All bundles belonging to a VLIW instruction from a given thread are issued simultaneously. To minimize cluster conflicts between threads, a very simple hardware-based cluster renaming mechanism is proposed. The experimental results show that CSMT significantly improves ILP compared with other multithreading approaches suited to VLIW. For instance, with 4 threads CSMT shows an average speedup of 113% over a single-thread VLIW architecture and 36% over Interleaved MultiThreading (IMT). In some cases, the speedup can be as high as 228% over the single-thread architecture and 97% over IMT. Moreover, a 2-thread CSMT processor achieves almost the same performance as a 4-thread IMT processor and outperforms it in some cases.
dc.format.extent: 4 p.
dc.language.iso: eng
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh: Simultaneous multithreading processors
dc.subject.lcsh: Embedded computer systems
dc.subject.other: Clustering
dc.subject.other: VLIW processor
dc.subject.other: Multithreading
dc.subject.other: CSMT
dc.title: Performance evaluation of CSMT for VLIW processors
dc.type: Conference report
dc.subject.lemac: Microprocessadors
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.description.peerreviewed: Peer Reviewed
dc.rights.access: Open Access
drac.iddocument: 2377709
dc.description.version: Postprint (author’s final draft)
upcommons.citation.author: Gupta, M.; Llosa, J.; Sánchez, F.;
upcommons.citation.contributor: Advanced Computer Architecture and Compilation for Embedded Systems
upcommons.citation.pubplace: L'Aquila
upcommons.citation.published: true
upcommons.citation.publicationName: ACACES 2007: poster abstracts: July 18, 2007, L'Aquila, Italy
upcommons.citation.startingPage: 185
upcommons.citation.endingPage: 188
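
Illustrative note on the abstract: the issue rule it describes (a thread's VLIW instruction is issued only if all of its bundles can be placed at once, with cluster renaming used to reduce conflicts between threads) can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation. The abstract does not say how renaming or thread priority works, so the per-thread rotation offset, the fixed lowest-id-first priority, and the names rename and csmt_issue are all assumptions made for illustration.

```python
from typing import List, Set


def rename(clusters: Set[int], offset: int, n_clusters: int) -> Set[int]:
    """Map logical cluster ids to physical ones by rotating them.

    ASSUMPTION: a fixed per-thread rotation; the record does not
    specify the actual renaming function.
    """
    return {(c + offset) % n_clusters for c in clusters}


def csmt_issue(pending: List[Set[int]], offsets: List[int],
               n_clusters: int) -> List[int]:
    """Return the ids of the threads whose instruction issues this cycle.

    pending[t] holds the logical clusters that have a non-empty bundle
    in thread t's next VLIW instruction. An instruction is issued only
    if none of its renamed clusters is already taken, so all of its
    bundles go together or not at all.
    """
    busy: Set[int] = set()
    issued: List[int] = []
    # ASSUMPTION: fixed priority, lowest thread id first.
    for tid, logical in enumerate(pending):
        physical = rename(logical, offsets[tid], n_clusters)
        if not (physical & busy):
            busy |= physical
            issued.append(tid)
    return issued


# Two threads that both use logical clusters 0 and 1 on a 4-cluster machine.
without_renaming = csmt_issue([{0, 1}, {0, 1}], [0, 0], 4)
with_renaming = csmt_issue([{0, 1}, {0, 1}], [0, 2], 4)
print(without_renaming, with_renaming)
```

In this toy case the two threads collide when neither is renamed, while rotating the second thread by two clusters lets both instructions issue in the same cycle, which is the kind of conflict reduction the abstract attributes to cluster renaming.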



Except where otherwise noted, content on this work is licensed under a Creative Commons license: Attribution-NonCommercial-NoDerivs 3.0 Spain