Now showing items 104-123 of 217

    • MACC: Mercurium ACCelerator Model 

      Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Barcelona Supercomputing Center, 2015-05-05)
      Conference paper
      Open access
      GPU offloading is an emerging programming model. OpenMP includes the accelerator model in its latest 4.0 specification. In this paper we present a new implementation of this specification while generating "native" GPU ...
    • MetH: A family of high-resolution and variable-shape image challenges 

      Parés Pont, Ferran; Garcia Gasulla, Dario; Servat, Harald; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (2019-11-20)
      Research report
      Open access
      High-resolution and variable-shape images have not yet been properly addressed by the AI community. The approach of down-sampling data often used with convolutional neural networks is sub-optimal for many tasks, and has ...
    • Methodology to predict scalability of parallel applications 

      Rosas, Claudia; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Barcelona Supercomputing Center, 2015-05-05)
      Conference paper
      Open access
      On the road to exascale computing, inferring the expected performance of parallel applications is a complex task. Performance analysts need to identify the behavior of the applications and to extrapolate it to ...
    • MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain 

      Valero-Lara, Pedro; Sirvent, Raül; Peña, Antonio J.; Labarta Mancho, Jesús José (2019-05)
      Article
      Open access
      The simulation of the behavior of the human brain is one of the most ambitious challenges today, with no end of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
    • MPI+OpenMP tasking scalability for the simulation of the human brain 

      Valero-Lara, Pedro; Sirvent, Raul; Pena, A. J.; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2018)
      Conference paper
      Open access
      The simulation of the behavior of the human brain is one of the most ambitious challenges today, with no end of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
    • MPI+X: task-based parallelisation and dynamic load balance of finite element assembly 

      Garcia, Marta; Houzeaux, Guillaume; Ferrer, Roger; Artigues, Antoni; López, Victor; Labarta Mancho, Jesús José; Vázquez, Mariano (Taylor & Francis, 2019-05)
      Article
      Open access
      The main computing phases of numerical methods for solving partial differential equations are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of a hybrid MPI+X ...
    • MPI+X: task-based parallelization and dynamic load balance of finite element assembly 

      Garcia-Gasulla, Marta; Houzeaux, Guillaume; Ferrer, Roger; Artigues, Antoni; López, Victor; Labarta Mancho, Jesús José; Vázquez, Mariano (Taylor & Francis, 2018)
      Article
      Open access
      The main computing tasks of a finite element (FE) code for solving partial differential equations (PDEs) are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of ...
    • Multiple target task sharing support for the OpenMP accelerator model 

      Ozen, Guray; Mateo, Sergi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Beyer, James B. (Springer, 2016)
      Conference paper
      Open access
      The use of GPU accelerators is becoming common in HPC platforms due to their effective performance and energy efficiency. In addition, new generations of multicore processors are being designed with wider vector units ...
    • MUSA: a multi-level simulation approach for next-generation HPC machines 

      Grass, Thomas; Allande, César; Armejach, Adrià; Rico, Alejandro; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo; Casas, Marc; Moretó Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference paper
      Open access
      The complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically ...
    • NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP 

      González Tallada, Marc; Ayguadé Parra, Eduard; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Navarro, Nacho; Oliver Segura, José (2000-10)
      Article
      Access restricted by publisher policy
      This paper describes the support provided by the NanosCompiler to nested parallelism in OpenMP. The NanosCompiler is a source-to-source parallelizing compiler implemented around a hierarchical internal program representation ...
    • New OpenMP directives for irregular data access loops 

      Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Oliver Segura, José; Henty, David (2002-03)
      Article
      Access restricted by publisher policy
      Many scientific applications involve array operations that are sparse in nature, i.e., array elements depend on the values of relatively few elements of the same or another array. When parallelised in the shared-memory model, ...
    • Noise inspector tool 

      Utrera Iglesias, Gladys Miriam; Fornés de Juan, Jordi; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Conference paper
      Open access
      Operating system noise can interfere with the normal execution of programs. This behavior becomes especially important when scaling parallel programs and is amplified by global synchronizations. This work presents a tool ...
    • O(n) key–value sort with active compute memory 

      Esmaili Dokht, Pouya; Guiot Cusido, Miquel; Radojkovic, Petar; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Adlard, Jason; Amato, Paolo; Sforzin, Marco (Institute of Electrical and Electronics Engineers (IEEE), 2024-05)
      Article
      Open access
      We propose the Active Compute Memory (ACM), a near-memory-processing architecture capable of performing key–value sort directly in the DRAM. In the ACM architecture, sort is merely the writing of data into memory with one ...
    • Oblivious routing schemes in extended generalized fat tree networks 

      Rodríguez Herrera, Germán; Minkenberg, Cyriel; Beivide Palacio, Ramon; Luijten, Ronald P.; Labarta Mancho, Jesús José; Valero Cortés, Mateo (IEEE Computational Intelligence Society, 2009)
      Conference paper
      Open access
      A family of oblivious routing schemes for fat trees and their slimmed versions is presented in this work. First, two popular oblivious routing algorithms, which we refer to as S-mod-k and D-mod-k, are analyzed in detail. ...
    • OmpSs-2@Cluster: Distributed memory execution of nested OpenMP-style tasks 

      Aguilar Mena, Jimmy; Ali, Omar Shaaban Ibrahim; Beltran Querol, Vicenç; Carpenter, Paul Matthew; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer Nature, 2022)
      Conference paper
      Open access
      State-of-the-art programming approaches generally have a strict division between intra-node shared memory parallelism and inter-node MPI communication. Tasking with dependencies offers a clean, dependable abstraction for ...
    • OmpSs@cloudFPGA: An FPGA task-based programming model with message passing 

      Haro Ruiz, Juan Miguel de; Cano, Rubén; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Abel, François; Ringlein, Burkhard; Weiss, Beat (Institute of Electrical and Electronics Engineers (IEEE), 2022)
      Conference paper
      Open access
      Nowadays, a new parallel paradigm for energy-efficient heterogeneous hardware infrastructures is required to achieve better performance at a reasonable cost on high-performance computing applications. Under this new paradigm, ...
    • OmpSs@FPGA framework for high performance FPGA computing 

      Haro Ruiz, Juan Miguel de; Bosch Pons, Jaume; Filgueras Izquierdo, Antonio; Vidal, Miquel; Jiménez González, Daniel; Álvarez Martínez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2021-12-01)
      Article
      Open access
      This paper presents the new features of the OmpSs@FPGA framework. OmpSs is a data-flow programming model that supports task nesting and dependencies to target asynchronous parallelism and heterogeneity. OmpSs@FPGA is the ...
    • On automatic loop data-mapping for distributed-memory multiprocessors 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Llaberia Griñó, José M.; Valero Cortés, Mateo (Springer, 1991)
      Conference paper
      Open access
      In this paper we present a unified approach for compiling programs for Distributed-Memory Multiprocessors (DMM). Parallelization of sequential programs for DMM is much more difficult to achieve than for shared memory systems ...
    • On the behavior of convolutional nets for feature extraction 

      Garcia-Gasulla, Dario; Parés Pont, Ferran; Vilalta Arias, Armand; Moreno, Jonatan; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises; Suzumura, Toyotaro (2018-03)
      Article
      Open access
      Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive ...
    • On the instrumentation of OpenMP and OmpSs Tasking constructs 

      Servat, Harald; Teruel, Xavier; Llort Sánchez, Germán; Duran González, Alejandro; Giménez, J.; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2012)
      Conference paper
      Open access
      Parallelism has become more and more commonplace with the advent of multicore processors. Although different parallel programming models have arisen to exploit the computing capabilities of such processors, ...