Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture

Álvarez Mesa, Mauricio; Ramírez Bellido, Alejandro; Valero Cortés, Mateo; Azevedo, Arnaldo; Meenderinck, Cor; Juurlink, Ben

dc.contributor.author	Álvarez Mesa, Mauricio
dc.contributor.author	Ramírez Bellido, Alejandro
dc.contributor.author	Valero Cortés, Mateo
dc.contributor.author	Azevedo, Arnaldo
dc.contributor.author	Meenderinck, Cor
dc.contributor.author	Juurlink, Ben
dc.contributor.other	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
dc.date.accessioned	2010-07-01T10:19:11Z
dc.date.available	2010-07-01T10:19:11Z
dc.date.created	2009-04-23
dc.date.issued	2009-04-23
dc.identifier.citation	Álvarez, M. [et al.]. Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture. In: 2009 Colombian Computing Conference. "Cuarto Congreso Colombiano de Computación, 4CCC: abril 23-25, 2009, Bucaramanga, Colombia". Bucaramanga: 2009, p. 108-117.
dc.identifier.isbn	978-958-8166-43-8
dc.identifier.uri	http://hdl.handle.net/2117/7947
dc.description.abstract	This paper presents a study of the performance scalability of a macroblock-level parallelization of the H.264 decoder for High Definition (HD) applications on a multiprocessor architecture. We have implemented this parallelization on a cache coherent Non-uniform Memory Access (cc-NUMA) shared memory multiprocessor (SMP) and compared the results with the theoretical expectations. Three different scheduling techniques were analyzed: static, dynamic and dynamic with tail-submit. A dynamic scheduling approach with a tail-submit optimization presents the best performance, obtaining a maximum speed-up of 9.5 using 24 processors. A detailed profiling analysis showed that thread synchronization is one of the limiting factors for achieving a better parallel scalability. The paper includes an evaluation of the impact of using blocking synchronization APIs like POSIX threads and POSIX real-time extensions. Results showed that macroblock-level parallelism, as a very fine-grain form of Thread-Level Parallelism (TLP), is highly affected by the thread synchronization overhead generated by these APIs. Other synchronization methods, possibly with hardware support, are required in order to make MB-level parallelization more scalable.
dc.format.extent	10 p.
dc.language.iso	eng
dc.subject	UPC subject areas::Computer science::Computer architecture::Parallel architectures
dc.subject.lcsh	cc-NUMA multiprocessor architecture
dc.subject.lcsh	H.264
dc.title	Performance evaluation of macroblock-level parallelization of H.264 decoding on a cc-NUMA multiprocessor architecture
dc.type	Conference report
dc.subject.lemac	Multiprocessors -- Evaluation
dc.contributor.group	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.description.peerreviewed	Peer Reviewed
dc.relation.publisherversion	http://serverlab.unab.edu.co:8080/wikimedia/memorias/fullpapers/108.pdf
dc.rights.access	Restricted access - publisher's policy
local.identifier.drac	2574254
dc.description.version	Postprint (published version)
local.citation.author	Álvarez, M.; Ramírez, A.; Valero, M.; Azevedo, A.; Meenderinck, C.; Juurlink, B.
local.citation.contributor	Colombian Computing Conference
local.citation.pubplace	Bucaramanga
local.citation.publicationName	Cuarto Congreso Colombiano de Computación, 4CCC: abril 23-25, 2009, Bucaramanga, Colombia
local.citation.startingPage	108
local.citation.endingPage	117

Files in this item

Name: Performance evaluation of ...
Size: 309.1 KB
Format: PDF

This item appears in the following collections

Conference papers/presentations [784]
Conference papers/presentations [1,955]

UPCommons. Open knowledge portal of the UPC
