
Use this identifier to cite or link to this item: http://hdl.handle.net/2117/11345

Item not available in open access due to publisher policy

File: hipeac_ramirez_10.pdf
Size: 353.38 kB
Format: Adobe PDF
Access: Restricted

Citation: Carpenter, P.; Ramirez, A.; Ayguade, E. Buffer sizing for self-timed stream programs on heterogeneous distributed memory multiprocessors. In: International Conference on High Performance Embedded Architectures & Compilers (HiPEAC). "HiPEAC 2010 International conference on High-Performance Embedded Architectures and Compilers". Pisa: Springer Verlag, 2010, p. 96-110.
Title: Buffer sizing for self-timed stream programs on heterogeneous distributed memory multiprocessors
Authors: Carpenter, Paul; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard
Publisher: Springer Verlag
Date: 2010
Document type: Conference report
Abstract: Stream programming is a promising way to expose concurrency to the compiler. A stream program is built from kernels that communicate only via point-to-point streams. The stream compiler statically allocates these kernels to processors, applying blocking, fission and fusion transformations. The compiler determines the sizes of the communication buffers, which affects performance since local memories can be small. In this paper, we propose a feedback-directed algorithm that determines the size of each communication buffer, based on i) the stream program that has been mapped onto processors, ii) feedback from an earlier execution, and iii) the memory constraints. The algorithm exposes a trade-off between throughput and latency. It is general, in that it applies to stream programs with unstructured stream graphs, and it supports variable execution times and communication rates. We show results for the StreamIt benchmarks and random graphs. For the StreamIt benchmarks, throughput is optimal after the first iteration. For random graphs with stochastic computation times, throughput is within 3% of optimal after four iterations. Compared with the previous general algorithm, by Basten and Hoogerbrugge, our algorithm has significantly better performance and latency.
ISBN: 978-3-642-11515-8
URI: http://hdl.handle.net/2117/11345
DOI: 10.1007/978-3-642-11515-8_9
Publisher's version: http://www.springerlink.com/content/0g143884xj21n085/fulltext.pdf
Appears in collections:
CAP - Grup de Computació d'Altes Prestacions. Conference papers and communications
Departament d'Arquitectura de Computadors. Conference papers and communications
Others. Submissions from DRAC

This item (except texts and images not created by the author) is subject to a Creative Commons license.

 

Universitat Politècnica de Catalunya. Servei de Biblioteques, Publicacions i Arxius