Show simple item record

dc.contributor.authorÁlvarez Mesa, Mauricio
dc.contributor.authorRamírez Bellido, Alejandro
dc.contributor.authorValero Cortés, Mateo
dc.contributor.authorAzevedo, Arnaldo
dc.contributor.authorMeenderinck, Cor
dc.contributor.authorJuurlink, Ben
dc.contributor.otherUniversitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned2010-07-01T10:19:11Z
dc.date.available2010-07-01T10:19:11Z
dc.date.created2009-04-23
dc.date.issued2009-04-23
dc.identifier.citationÁlvarez, M. [et al.]. Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture. A: 2009 Colombian Computing Conference. "Cuarto Congreso Colombiano de Computación, 4CCC: abril 23-25, 2009, Bucaramanga, Colombia". Bucaramanga: 2009, p. 108-117.
dc.identifier.isbn978-958-8166-43-8
dc.identifier.urihttp://hdl.handle.net/2117/7947
dc.description.abstractThis paper presents a study of the performance scalability of a macroblock-level parallelization of the H.264 decoder for High Definition (HD) applications on a multiprocessor architecture. We have implemented this parallelization on a cache coherent Non-uniform Memory Access (cc-NUMA) shared memory multiprocessor (SMP) and compared the results with the theoretical expectations. Three different scheduling techniques were analyzed: static, dynamic and dynamic with tail-submit. A dynamic scheduling approach with a tail-submit optimization presents the best performance obtaining a maximum speed-up of 9.5 using 24 processors. A detailed profiling analysis showed that thread synchronization is one of the limiting factors for achieving a better parallel scalability. The paper includes an evaluation of the impact of using blocking synchronization APIs like POSIX threads and POSIX real-time extensions. Results showed that macroblock-level parallelism as a very fine-grain form of Thread-Level Parallelism (TLP) is highly affected by the thread synchronization overhead generated by these APIs. Other synchronization methods, possibly with hardware support, are required in order to make MB-level parallelization more scalable.
dc.format.extent10 p.
dc.language.isoeng
dc.subjectÀrees temàtiques de la UPC::Informàtica::Arquitectura de computadors::Arquitectures paral·leles
dc.subject.lcshcc-NUMA multiprocessor architecture
dc.subject.lcshH.264
dc.titlePerformance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture
dc.typeConference report
dc.subject.lemacMultiprocessadors -- Avaluació
dc.contributor.groupUniversitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.description.peerreviewedPeer Reviewed
dc.relation.publisherversionhttp://serverlab.unab.edu.co:8080/wikimedia/memorias/fullpapers/108.pdf
dc.rights.accessRestricted access - publisher's policy
drac.iddocument2574254
dc.description.versionPostprint (published version)
upcommons.citation.authorÁlvarez, M.; Ramírez, A.; Valero, M.; Azevedo, A.; Meenderinck, C.; Juurlink, B.
upcommons.citation.contributorColombian Computing Conference
upcommons.citation.pubplaceBucaramanga
upcommons.citation.publishedtrue
upcommons.citation.publicationNameCuarto Congreso Colombiano de Computación, 4CCC: abril 23-25, 2009, Bucaramanga, Colombia
upcommons.citation.startingPage108
upcommons.citation.endingPage117



All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work is prohibited without permission of the copyright holder.