Browsing by Author "González Tallada, Marc"
Now showing items 1-20 of 32
A novel asynchronous software cache implementation for the Cell-BE processor
Balart, J; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Sura, Z; Chen, T; Zhang, T; O'Brien, Kevin; O'Brien, Kathryn (2008-10)
Article
Restricted access - publisher's policy
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several services that allow the compiler to generate ...
A proposal for error handling in OpenMP
Duran González, Alejandro; Ferrer, Roger; Costa Prats, Juan José; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2008)
Article
Restricted access - publisher's policy
OpenMP has been focused in performance applied to numerical applications, but when we try to move this focus to other kind of applications, like Web servers, we detect one important lack. In these applications, performance ...
Abisko: Deep codesign of an architecture for spiking neural networks using novel neuromorphic materials
Vetter, Jeffrey; Date, Prasanna; Fahim, Farah; Kulkarni, Shruti R.; Maksymovych, Petro; Talin, Alec; González Tallada, Marc; Vanna Iampikul, Pruek; Young, Aaron Reed; Brooks, David (SAGE publishing, 2023-07)
Article
Open Access
The Abisko project aims to develop an energy-efficient spiking neural network (SNN) computing architecture and software system capable of autonomous learning and operation. The SNN architecture explores novel neuromorphic ...
Achieving high memory performance from heterogeneous architectures with the SARC programming model
Ferrer, Roger; Beltran Querol, Vicenç; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (ACM, 2009)
Conference lecture
Restricted access - publisher's policy
Current heterogeneous multicore architectures, including the Cell/B.E., GPUs, and future developments, like Larrabee, require enormous programming efforts to efficiently run current parallel applications, achieving high ...
Applying interposition techniques for performance analysis of OpenMP parallel applications
González Tallada, Marc; Serra, Albert; Martorell Bofill, Xavier; Oliver Segura, José; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Navarro, Nacho (Institute of Electrical and Electronics Engineers (IEEE), 2000)
Conference report
Open Access
Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performance degradation may arise. We believe that ...
Automatic multilevel parallelization using OpenMP
Jin, H; Jost, G; Yan, J; Ayguadé Parra, Eduard; González Tallada, Marc; Martorell Bofill, Xavier (2004-06)
Article
Restricted access - publisher's policy
In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler ...
Automatic pre-fetch and modulo scheduling transformations for the cell BE architecture
Vujic, N; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2008-01)
Article
Restricted access - publisher's policy
Ease of programming is one of the main impediments for the broad acceptance of multi-core systems with no hardware support for transparent data transfer between local and global memories. Software cache is a robust approach ...
Coarse grain parallelization of deep neural networks
González Tallada, Marc (Institute of Electrical and Electronics Engineers (IEEE), 2016)
Conference lecture
Restricted access - publisher's policy
Deep neural networks (DNN) have recently achieved extraordinary results in domains like computer vision and speech recognition. An essential element for this success has been the introduction of high performance computing ...
Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures
Álvarez Martí, Lluc; Vilanova, Lluís; Moretó Planas, Miquel; Casas, Marc; González Tallada, Marc; Martorell Bofill, Xavier; Navarro, Nacho; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2015)
Conference report
Open Access
The increasing number of cores in manycore architectures causes important power and scalability problems in the memory subsystem. One solution is to introduce scratchpad memories alongside the cache hierarchy, forming a ...
Complex pipelined executions in OpenMP parallel applications
González Tallada, Marc; Ayguadé Parra, Eduard; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2001)
Conference report
Open Access
This paper proposes a set of extensions to the OpenMP programming model to express complex pipelined computations. This is accomplished by defining, in the form of directives, precedence relations among the tasks originated ...
Compute units in OpenMP: extensions for heterogeneous parallel programming
González Tallada, Marc; Morancho Llena, Enrique (John Wiley & Sons, 2024-01-10)
Article
Open Access
This article evaluates the current support for heterogeneous OpenMP 5.2 applications regarding the simultaneous activation of host and device computing units (e.g., CPUs, GPUs, or FPGAs). The article identifies limitations ...
Dual-level parallelism exploitation with OpenMP in coastal ocean circulation modeling
González Tallada, Marc; Ayguadé Parra, Eduard; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Luong, P V (2002-05)
Article
Open Access
Two alternative dual-level parallel implementations of the Multiblock Grid Princeton Ocean Model (MGPOM) are compared in this paper. The first one combines the use of two programming paradigms: message passing with the ...
Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications
Ayguadé Parra, Eduard; González Tallada, Marc; Martorell Bofill, Xavier; Jost, G (2006-05)
Article
Restricted access - publisher's policy
In this paper we describe the parallelization of the multi-zone code versions of the NAS Parallel Benchmarks employing multi-level OpenMP parallelism. For our study, we use the NanosCompiler that supports nesting of OpenMP ...
Enhancing Kokkos with OpenACC
Valero Lara, Pedro; Lee, Seyong; González Tallada, Marc; Denny, Joel; Teranishi, Keita; Vetter, Jeffrey (SAGE publishing, 2024-06-13)
Article
Open Access
C++ template metaprogramming has emerged as a prominent approach for achieving performance portability in heterogeneous computing. Kokkos represents a notable paradigm in this domain, offering programmers a suite of ...
Evaluation of memory performance on the cell BE with the SARC programming model
Ferrer, Roger; González Tallada, Marc; Silla, Federico; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2008)
Conference lecture
Restricted access - publisher's policy
With the advent of multicore architectures, especially with the heterogeneous ones, both computational and memory top performance are difficult to obtain using traditional programming models. Usually, programmers have to ...
Evaluation of work distribution schedulers for heterogeneous architectures and scientific applications
González Tallada, Marc; Morancho Llena, Enrique (Frontiers Media SA, 2024-12-10)
Article
Open Access
This article explores and evaluates variants of state-of-the-art work distribution schemes adapted for scientific applications running on hybrid systems. A hybrid implementation (multi-GPU and multi-CPU) of the NASA Parallel ...
Experiences parallelizing a web server with OpenMP
Balart Tarzan, Jairo; Duran González, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2008)
Article
Open Access
Multi-threaded web servers are typically parallelized by hand using the pthreads library. OpenMP has rarely been used to parallelize such kind of applications, although we foresee that it can be a great tool for network ...
Exploiting pipelined executions in OpenMP
González Tallada, Marc; Ayguadé Parra, Eduard; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2003)
Conference report
Open Access
We propose a set of extensions to the OpenMP programming model to express point-to-point synchronisation schemes. This is accomplished by defining, in the form of directives, precedence relations among the tasks that are ...
Extending OpenMP to survive the heterogeneous multi-core era
Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Bellens, Pieter; Cabrera, Daniel; Duran González, Alejandro; Ferrer, Roger; González Tallada, Marc; Igual Peña, Francisco D.; Jiménez González, Daniel; Labarta Mancho, Jesús José; Martinell Andreu, Luis; Martorell Bofill, Xavier; Mayo Gual, Rafael; Pérez Cáncer, Josep Maria; Planas, Judit; Quintana Ortí, Enrique Salvador (2010-10)
Article
Restricted access - publisher's policy
Hardware-software coherence protocol for the coexistence of caches and local memories
Álvarez Martí, Lluc; Vilanova, Lluís; González Tallada, Marc; Martorell Bofill, Xavier; Navarro, Nacho; Ayguadé Parra, Eduard (2015-01-01)
Article
Open Access
Cache coherence protocols limit the scalability of multicore and manycore architectures and are responsible for an important amount of the power consumed in the chip. A good way to alleviate these problems is to introduce ...