Browse by author "Martorell Bofill, Xavier"
Showing items 1-20 of 139
-
A library implementation of the nano-threads programming model
Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Navarro, Nacho; Ayguadé Parra, Eduard (Springer, 1996)
Conference report
Open access
In this paper we describe the design and implementation of a user-level thread package based on the nano-threads programming model, whose goal is to efficiently manage the application parallelism at user-level. Nano-thread ...
-
A methodology approach to compare performance of parallel programming models for shared-memory architectures
Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (Springer, 2020)
Book chapter
Open access
The majority of current HPC applications are composed of complex and irregular data structures that involve techniques such as linear algebra, graph algorithms, and resource management, for which new platforms with varying ...
-
A module-based cell processor simulator
Cabarcas Jaramillo, Felipe; Rico Carro, Alejandro; Rodenas, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC), 2006)
Conference lecture
Open access
An interesting design alternative to replication-based chip multiprocessors is to create heterogeneous chip multiprocessors composed of several different cores, with one or more of them running the operating system and ...
-
A novel asynchronous software cache implementation for the Cell-BE processor
Balart, J; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Sura, Z; Chen, T; Zhang, T; O'Brien, Kevin; O'Brien, Kathryn (2008-10)
Article
Restricted access (publisher's policy)
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides several services that allow the compiler to generate ...
-
A proposal for error handling in OpenMP
Duran González, Alejandro; Ferrer, Roger; Costa Prats, Juan José; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2008)
Article
Restricted access (publisher's policy)
OpenMP has focused on performance in numerical applications, but when we try to move this focus to other kinds of applications, like Web servers, we detect an important gap. In these applications, performance ...
-
A proposal for task-generating loops in OpenMP
Teruel, Xavier; Klemm, Michael; Li, Kelvin; Martorell Bofill, Xavier; Olivier, Stephen; Terboven, Christian (Springer, 2013)
Conference report
Restricted access (publisher's policy)
With the addition of the OpenMP* tasking model, programmers are able to improve and extend the parallelization opportunities of their codes. Programmers can also distribute the creation of tasks using a worksharing construct, ...
-
A streaming machine description and programming model
Carpenter, Paul Matthew; Ródenas Picó, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (2007-07)
Article
Restricted access (publisher's policy)
In this paper we present the initial development of a streaming environment based on a programming model and machine description. The stream programming model consists of an extension to the C language and its translation ...
-
Accelerating boosting-based face detection on GPUs
Oro, David; Fernández, Carles; Segura, Carlos; Martorell Bofill, Xavier; Hernando Pericás, Francisco Javier (2012)
Conference report
Restricted access (publisher's policy)
The goal of face detection is to determine the presence of faces in arbitrary images, along with their locations and dimensions. As with any graphics workload, these algorithms benefit from data-level ...
-
Accelerating software memory compression on the Cell/B.E.
Beltran Querol, Vicenç; Martorell Bofill, Xavier; Torres Viñals, Jordi; Ayguadé Parra, Eduard (2008)
Conference report
Restricted access (publisher's policy)
The idea of transparently compressing and decompressing the content of main memory to virtually enlarge its capacity has been previously proposed and studied in the literature. The rationale behind this idea lies in the ...
-
Accelerating SpMV on FPGAs through block-row compress: a task-based approach
Oliver Segura, José; Álvarez Martínez, Carlos; Cervero García, Teresa; Martorell Bofill, Xavier; Davis, John D.; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2023)
Conference lecture
Open access
Sparse Matrix-Vector multiplication (SpMV), computing y=α⋅A×x+β⋅y where y,x are dense vectors, α,β two scalar constants, and A is a sparse matrix, is a key kernel in many HPC applications. It exhibits a kind of memory ...
-
Achieving high memory performance from heterogeneous architectures with the SARC programming model
Ferrer, Roger; Beltran Querol, Vicenç; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (ACM, 2009)
Conference lecture
Restricted access (publisher's policy)
Current heterogeneous multicore architectures, including the Cell/B.E., GPUs, and future developments, like Larrabee, require enormous programming efforts to efficiently run current parallel applications, achieving high ...
-
ACOTES project: Advanced compiler technologies for embedded streaming
Duranton, M.; Munk, H.; Ayguadé Parra, Eduard; Bastoul, C.; Carpenter, Paul Matthew; Chamski, Z.; Cohen, A.; Cornero, M.; Dumont, P.; Pop, S.; Pop, A.; Ornstein, A.; Nuzman, D.; Miranda, C.; Martorell Bofill, Xavier; Lindwer, M.; Ladelsky, R.; Ferrer, Roger; Fellahi, M.; Pouchet, L. N; Zaks, A.; Shvadron, U.; Trifunovic, K.; Rohou, E.; Rosen, I.; Ramírez Bellido, Alejandro; Ródenas, D. (2011-04)
Article
Open access
Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming-oriented systems have become dominant in a wide range of domains, including embedded ...
-
An OpenMP* barrier using SIMD instructions for Intel® Xeon Phi™ coprocessor
Caballero, Diego; Duran González, Alejandro; Martorell Bofill, Xavier (Springer, 2013)
Conference report
Restricted access (publisher's policy)
Barrier synchronisation has been a widely studied topic since the supercomputer era due to its significant impact on the overall performance of parallel applications. With the current shift to many-core architectures, such as ...
-
Analyzing the impact of communication imbalance in high-speed networks
Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (2017-12-21)
Article
Open access
In this work we analyze the communication load imbalance generated by irregular-data applications running in a multi-node cluster. Experimental approaches to diminish communication load imbalance are evaluated using a ...
-
Analyzing the performance of hierarchical collective algorithms on ARM-based multicore clusters
Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2022)
Conference lecture
Open access
MPI is the de facto communication standard library for parallel applications in distributed memory architectures. Collective operations performance is critical in HPC applications as they can become the bottleneck of their ...
-
Application acceleration on FPGAs with OmpSs@FPGA
Bosch, Jaume; Tan, Xubin; Filgueras Izquierdo, Antonio; Vidal, Miquel; Mateu, Marc; Jiménez-González, Daniel; Álvarez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference report
Open access
OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous ...
-
Applying interposition techniques for performance analysis of OPENMP parallel applications
González Tallada, Marc; Serra, Albert; Martorell Bofill, Xavier; Oliver Segura, José; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Navarro, Nacho (Institute of Electrical and Electronics Engineers (IEEE), 2000)
Conference report
Open access
Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. During the execution of a parallel program, many individual situations of performance degradation may arise. We believe that ...
-
Asynchronous runtime with distributed manager for task-based programming models
Bosch Pons, Jaume; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2020-09)
Article
Open access
Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The OpenMP 4.0 standard introduced the possibility of describing a set of ...
-
Auto-tuned OpenCL kernel co-execution in OmpSs for heterogeneous systems
Pérez, Borja; Stafford, Esteban; Bosque Orero, José Luis; Beivide Palacio, Ramon; Mateo Bellido, Sergi; Teruel García, Xavier; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2019-03-01)
Article
Open access
The emergence of heterogeneous systems has been very notable recently. The nodes of the most powerful computers integrate several compute accelerators, like GPUs. Profiting from such node configurations is not a trivial ...
-
Automatic communication coalescing for irregular computations in UPC language
Alvanos, Michail; Tiotto, Ettore; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier (IBM, 2012)
Conference report
Restricted access (publisher's policy)
Partitioned Global Address Space (PGAS) languages appeared to address programmer productivity in large-scale parallel machines. However, fine-grained accesses on shared structures have been identified as one of the main ...