Now showing items 6-17 of 17

    • Evaluation of memory performance on the Cell BE with the SARC programming model

      Ferrer, Roger; González Tallada, Marc; Silla, Federico; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2008)
      Conference lecture
      Restricted access - publisher's policy
      With the advent of multicore architectures, especially with the heterogeneous ones, both computational and memory top performance are difficult to obtain using traditional programming models. Usually, programmers have to ...
    • Extending OpenMP to survive the heterogeneous multi-core era 

      Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Bellens, Pieter; Cabrera, Daniel; Duran González, Alejandro; Ferrer, Roger; González Tallada, Marc; Igual Peña, Francisco D.; Jiménez González, Daniel; Labarta Mancho, Jesús José; Martinell Andreu, Luis; Martorell Bofill, Xavier; Mayo Gual, Rafael; Pérez Cáncer, Josep Maria; Planas, Judit; Quintana Ortí, Enrique Salvador (2010-10)
      Article
      Restricted access - publisher's policy
    • How can we improve energy efficiency through user-directed vectorization and task-based parallelization? 

      Caminal, Helena; Caballero, Diego; Cebrián, Juan M.; Ferrer, Roger; Casas, Marc; Moretó Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (Barcelona Supercomputing Center, 2015-05-05)
      Conference report
      Open access
      Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...
    • MPI+X: task-based parallelisation and dynamic load balance of finite element assembly 

      Garcia, Marta; Houzeaux, Guillaume; Ferrer, Roger; Artigues, Antoni; López, Victor; Labarta Mancho, Jesús José; Vázquez, Mariano (Taylor & Francis, 2019-05)
      Article
      Open access
      The main computing phases of numerical methods for solving partial differential equations are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of a hybrid MPI+X ...
    • MPI+X: task-based parallelization and dynamic load balance of finite element assembly 

      Garcia-Gasulla, Marta; Houzeaux, Guillaume; Ferrer, Roger; Artigues, Antoni; López, Victor; Labarta Mancho, Jesús José; Vázquez, Mariano (Taylor & Francis, 2018)
      Article
      Open access
      The main computing tasks of a finite element (FE) code for solving partial differential equations (PDEs) are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of ...
    • Nebelung: execution environment for transactional OpenMP 

      Milovanovic, M; Ferrer, Roger; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2008-06)
      Article
      Restricted access - publisher's policy
      Future generations of Chip Multiprocessors (CMP) will provide dozens or even hundreds of cores inside the chip. Writing applications that benefit from the massive computational power offered by these chips is not going to ...
    • Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL 

      Ferrer, Roger; Planas Carbonell, Judit; Bellens, Pieter; Duran González, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2011)
      Article
      Restricted access - publisher's policy
      In this paper, we present OMPSs, a programming model based on OpenMP and StarSs, that can also incorporate the use of OpenCL or CUDA kernels. We evaluate the proposal on three different architectures, SMP, Cell/B.E. and ...
    • Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL 

      Ferrer, Roger; Planas Carbonell, Judit; Bellens, Pieter; Duran González, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2010)
      Conference report
      Restricted access - publisher's policy
      In this paper, we present OMPSs, a programming model based on OpenMP and StarSs, that can also incorporate the use of OpenCL or CUDA kernels. We evaluate the proposal on three different architectures, SMP, Cell/B.E. and ...
    • Performance and energy effects on task-based parallelized applications: User-directed versus manual vectorization 

      Caminal Pallarés, Helena; Caballero de Gea, Diego; Cebrián González, Juan Manuel; Ferrer, Roger; Casas, Marc; Moretó Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (2018-06)
      Article
      Open access
      Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...
    • Practical experience with Nebelung: the runtime support for transactional memory and OpenMP 

      Milovanovic, Milos; Ferrer, Roger; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2007-08)
      Research report
      Open access
      Transactional Memory (TM) is a key future technology for emerging many-cores. On the other hand, OpenMP provides a vast established base for writing parallel programs, especially for scientific applications. Combining TM ...
    • Static analysis to enhance programmability and performance in OmpSs-2 

      Munera, Adrian; Royuela Alcázar, Sara; Ferrer, Roger; Peñacoba, Raul; Quiñones, Eduardo (Springer, Cham, 2020)
      Conference report
      Open access
      Task-based parallel programming models based on compiler directives have proved their effectiveness at describing parallelism in High-Performance Computing (HPC) applications. Recent studies show that cutting-edge Real-Time ...
    • Transactional memory and OpenMP

      Milovanovic, Milos; Ferrer, Roger; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2007-06)
      Article
      Restricted access - publisher's policy
      Future generations of Chip Multiprocessors (CMP) will provide dozens or even hundreds of cores inside the chip. Writing applications that benefit from the massive computational power offered by these chips is not going to ...