Browsing by Author "Martorell Bofill, Xavier"
Now showing items 1-20 of 146
- A library implementation of the nano-threads programming model
Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Navarro, Nacho; Ayguadé Parra, Eduard (Springer, 1996)
Conference report
Open Access
In this paper we describe the design and implementation of a user-level thread package based on the nano-threads programming model, whose goal is to efficiently manage the application parallelism at user-level. Nano-thread ...
- A methodology approach to compare performance of parallel programming models for shared-memory architectures
Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (Springer, 2020)
Part of book or chapter of book
Open Access
The majority of current HPC applications are composed of complex and irregular data structures that involve techniques such as linear algebra, graph algorithms, and resource management, for which new platforms with varying ...
- A module-based cell processor simulator
Cabarcas Jaramillo, Felipe; Rico Carro, Alejandro; Rodenas, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC), 2006)
Conference lecture
Open Access
An interesting design alternative to replication-based chip multiprocessors is to create heterogeneous chip multiprocessors composed of several different cores, with one or more of them running the operating system and ...
- A novel asynchronous software cache implementation for the Cell-BE processor
Balart, J; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Sura, Z; Chen, T; Zhang, T; O'Brien, Kevin; O'Brien, Kathryn (2008-10)
Article
Restricted access - publisher's policy
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides several services that allow the compiler to generate ...
- A proposal for error handling in OpenMP
Duran González, Alejandro; Ferrer, Roger; Costa Prats, Juan José; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2008)
Article
Restricted access - publisher's policy
OpenMP has focused on performance for numerical applications, but when we try to move this focus to other kinds of applications, such as Web servers, we detect one important gap. In these applications, performance ...
- A proposal for task-generating loops in OpenMP
Teruel, Xavier; Klemm, Michael; Li, Kelvin; Martorell Bofill, Xavier; Olivier, Stephen; Terboven, Christian (Springer, 2013)
Conference report
Restricted access - publisher's policy
With the addition of the OpenMP* tasking model, programmers are able to improve and extend the parallelization opportunities of their codes. Programmers can also distribute the creation of tasks using a worksharing construct, ...
- A streaming machine description and programming model
Carpenter, Paul Matthew; Ródenas Picó, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (2007-07)
Article
Restricted access - publisher's policy
In this paper we present the initial development of a streaming environment based on a programming model and machine description. The stream programming model consists of an extension to the C language and its translation ...
- Accelerating boosting-based face detection on GPUs
Oro, David; Fernández, Carles; Segura, Carlos; Martorell Bofill, Xavier; Hernando Pericás, Francisco Javier (2012)
Conference report
Restricted access - publisher's policy
The goal of face detection is to determine the presence of faces in arbitrary images, along with their locations and dimensions. As with any graphics workload, these algorithms benefit from data-level ...
- Accelerating software memory compression on the Cell/B.E.
Beltran Querol, Vicenç; Martorell Bofill, Xavier; Torres Viñals, Jordi; Ayguadé Parra, Eduard (2008)
Conference report
Restricted access - publisher's policy
The idea of transparently compressing and decompressing the content of main memory to virtually enlarge its capacity has been previously proposed and studied in the literature. The rationale behind this idea lies in the ...
- Accelerating SpMV on FPGAs through block-row compress: a task-based approach
Oliver Segura, José; Álvarez Martínez, Carlos; Cervero García, Teresa; Martorell Bofill, Xavier; Davis, John D.; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2023)
Conference lecture
Open Access
Sparse Matrix-Vector multiplication (SpMV), computing y=α⋅A×x+β⋅y where y,x are dense vectors, α,β two scalar constants, and A is a sparse matrix, is a key kernel in many HPC applications. It exhibits a kind of memory ...
- Achieving high memory performance from heterogeneous architectures with the SARC programming model
Ferrer, Roger; Beltran Querol, Vicenç; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (ACM, 2009)
Conference lecture
Restricted access - publisher's policy
Current heterogeneous multicore architectures, including the Cell/B.E., GPUs, and future developments, like Larrabee, require enormous programming efforts to efficiently run current parallel applications, achieving high ...
- ACOTES project: Advanced compiler technologies for embedded streaming
Duranton, M.; Munk, H.; Ayguadé Parra, Eduard; Bastoul, C.; Carpenter, Paul Matthew; Chamski, Z.; Cohen, A.; Cornero, M.; Dumont, P.; Pop, S.; Pop, A.; Ornstein, A.; Nuzman, D.; Miranda, C.; Martorell Bofill, Xavier; Lindwer, M.; Ladelsky, R.; Ferrer, Roger; Fellahi, M.; Pouchet, L. N; Zaks, A.; Shvadron, U.; Trifunovic, K.; Rohou, E.; Rosen, I.; Ramírez Bellido, Alejandro; Ródenas, D. (2011-04)
Article
Open Access
Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded ...
- An OpenMP* barrier using SIMD instructions for Intel® Xeon Phi™ coprocessor
Caballero, Diego; Duran González, Alejandro; Martorell Bofill, Xavier (Springer, 2013)
Conference report
Restricted access - publisher's policy
Barrier synchronisation has been a widely studied topic since the supercomputer era because of its significant impact on the overall performance of parallel applications. With the current shift to many-core architectures, such as ...
- Analyzing the impact of communication imbalance in high-speed networks
Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (2017-12-21)
Article
Open Access
In this work we analyze the communication load imbalance generated by irregular-data applications running in a multi-node cluster. Experimental approaches to diminish communication load imbalance are evaluated using a ...
- Analyzing the performance of hierarchical collective algorithms on ARM-based multicore clusters
Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2022)
Conference lecture
Open Access
MPI is the de facto communication standard library for parallel applications in distributed memory architectures. Collective operations performance is critical in HPC applications as they can become the bottleneck of their ...
- Application acceleration on FPGAs with OmpSs@FPGA
Bosch, Jaume; Tan, Xubin; Filgueras Izquierdo, Antonio; Vidal, Miquel; Mateu, Marc; Jiménez-González, Daniel; Álvarez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2019)
Conference report
Open Access
OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous ...
- Applying interposition techniques for performance analysis of OPENMP parallel applications
González Tallada, Marc; Serra, Albert; Martorell Bofill, Xavier; Oliver Segura, José; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Navarro, Nacho (Institute of Electrical and Electronics Engineers (IEEE), 2000)
Conference report
Open Access
Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. During the execution of a parallel program, many individual situations of performance degradation may arise. We believe that ...
- Asynchronous runtime with distributed manager for task-based programming models
Bosch Pons, Jaume; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2020-09)
Article
Open Access
Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of ...
- Auto-tuned OpenCL kernel co-execution in OmpSs for heterogeneous systems
Pérez, Borja; Stafford, Esteban; Bosque Orero, José Luis; Beivide Palacio, Ramon; Mateo Bellido, Sergi; Teruel García, Xavier; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2019-03-01)
Article
Open Access
The emergence of heterogeneous systems has been very notable recently. The nodes of the most powerful computers integrate several compute accelerators, like GPUs. Profiting from such node configurations is not a trivial ...
- Automated parallel execution of distributed task graphs with FPGA clusters
Haro Ruiz, Juan Miguel de; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ueno, Tomohiro; Sano, Kentaro; Ringlein, Burkhard; Abel, François; Weiss, Beat (Elsevier, 2024-11)
Article
Open Access
Over the years, Field Programmable Gate Arrays (FPGA) have been gaining popularity in the High Performance Computing (HPC) field, because their reconfigurability enables very fine-grained optimizations with low energy cost. ...