Now showing items 90-109 of 217

    • Identifying code phases using piece-wise linear regressions 

      Servat, Harald; Llort Sánchez, Germán; González García, Juan; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Text in conference proceedings
      Restricted access due to publisher's policy
      Node-level performance is one of the factors that may limit applications from reaching the supercomputers' peak performance. Studying node-level performance and attributing it to the source code results in valuable insight ...
    • Identifying critical code sections in dataflow programming models 

      Subotic, Vladimir; Sancho, Jose Carlos; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2013)
      Text in conference proceedings
      Restricted access due to publisher's policy
      The years of practice in optimizing applications point out that the major issue is focus - identifying the critical code section whose optimization would yield the highest overall speedup. While this issue is mainly solved for ...
    • Impact of the memory hierarchy on shared memory architectures in multicore programming models 

      Badia Sala, Rosa Maria; Pérez Cáncer, Josep Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (IEEE Computer Society Publications, 2009)
      Conference presentation
      Open access
      Many-core and multicore architectures put great pressure on parallel programming but give a unique opportunity to propose new programming models that automatically exploit the parallelism of these architectures. OpenMP is a ...
    • Implementing OmpSs support for regions of data in architectures with multiple address spaces 

      Bueno Hedo, Javier; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (ACM, 2013)
      Text in conference proceedings
      Restricted access due to publisher's policy
      The need for features for managing complex data accesses in modern programming models has increased due to the emerging hardware architectures. HPC hardware has moved towards clusters of accelerators and/or multicores, ...
    • Improving the integration of task nesting and dependencies in OpenMP 

      Pérez, Josep M.; Beltran Querol, Vicenç; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Text in conference proceedings
      Open access
      The tasking model of OpenMP 4.0 supports both nesting and the definition of dependences between sibling tasks. A natural way to parallelize many codes with tasks is to first taskify the high-level functions and then to ...
    • Improving the interoperability between MPI and task-based programming models 

      Sala Penadés, Kevin; Bellón, Jorge; Farré, Pau; Teruel, Xavier; Pérez, Josep M.; Peña, Antonio J.; Holmes, Daniel; Beltran Querol, Vicenç; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2018)
      Text in conference proceedings
      Open access
      In this paper we propose an API to pause and resume task execution depending on external events. We leverage this generic API to improve the interoperability between MPI synchronous communication primitives and tasks. When ...
    • Instrumentation environment for Java threaded applications 

      Guitart Fernández, Jordi; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Oliver, Jose; Labarta Mancho, Jesús José (XI Jornadas de Paralelismo, 2000)
      Text in conference proceedings
      Restricted access due to publisher's policy
      The rapid maturing process of the Java technology is encouraging users to develop portable applications using the Java language. As an important part of the definition of the Java language, the use of threads is becoming ...
    • Integrating blocking and non-blocking MPI primitives with task-based programming models 

      Sala Penadés, Kevin; Teruel García, Xavier; Pérez Cáncer, Josep Maria; Peña, Antonio J.; Beltran, Vicenç; Labarta Mancho, Jesús José (2019-07)
      Article
      Open access
      In this paper we present the Task-Aware MPI library (TAMPI) that integrates both blocking and non-blocking MPI primitives with task-based programming models. The TAMPI library leverages two new runtime APIs to improve both ...
    • Integrating memory perspective into the BSC performance tools 

      Servat, Harald; Labarta Mancho, Jesús José; Hoppe, Hans-Christian; Gimenez, Judit; Peña, Antonio J. (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Text in conference proceedings
      Open access
      The growing gap between processor and memory speeds results in complex memory hierarchies as processors evolve to mitigate such differences by taking advantage of locality of reference. In this direction, the BSC performance ...
    • Is the schedule clause really necessary in OpenMP? 

      Ayguadé Parra, Eduard; Blainey, Bob; Duran González, Alejandro; Labarta Mancho, Jesús José; Martínez, Francisco; Martorell Bofill, Xavier; Silvera, Raul (2003-06)
      Article
      Open access
      Choosing the appropriate assignment of loop iterations to threads is one of the most important decisions that need to be taken when parallelizing loops, the main source of parallelism in numerical applications. This is not ...
    • Java instrumentation suite: accurate analysis of Java threaded applications 

      Guitart Fernández, Jordi; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Oliver, Jose; Labarta Mancho, Jesús José (2000)
      Text in conference proceedings
      Open access
      The rapid maturing process of the Java technology is encouraging users to develop portable applications using the Java language. As an important part of the definition of the Java language, the use of threads is ...
    • Just-in-time renaming and lazy write-back on the Cell/B.E. 

      Bellens, Pieter; Pérez Cáncer, Josep Maria; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2009)
      Text in conference proceedings
      Restricted access due to publisher's policy
      Cell Superscalar (CellSs) provides a simple, flexible and easy programming approach for the Cell Broadband Engine (Cell/B.E.) that automatically exploits the inherent concurrency of applications at a function or task level. ...
    • Kernel-level scheduling for the nano-threads programming model 

      Polychronopoulos, Eleftherios D.; Martorell Bofill, Xavier; Nikolopoulos, Dimitrios S.; Labarta Mancho, Jesús José; Papatheodorou, Theodore S.; Navarro, Nacho (Association for Computing Machinery (ACM), 1998)
      Text in conference proceedings
      Open access
      Multiprocessor systems are increasingly becoming the systems of choice for low and high-end servers, running such diverse tasks as number crunching, large-scale simulations, data base engines and world wide web server ...
    • Loop parallelization: revisiting framework of unimodular transformations 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 1996)
      Text in conference proceedings
      Open access
      The paper extends the framework of linear loop transformations, adding a new nonlinear step to the transformation process. The current framework of linear loop transformation cannot identify a significant fraction of ...
    • MACC: Mercurium ACCelerator Model 

      Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Barcelona Supercomputing Center, 2015-05-05)
      Text in conference proceedings
      Open access
      GPU offloading is an emergent programming model. OpenMP includes the accelerator model in its latest 4.0 specification. In this paper we present a new implementation of this specification while generating "native" GPU ...
    • MetH: A family of high-resolution and variable-shape image challenges 

      Parés Pont, Ferran; Garcia Gasulla, Dario; Servat, Harald; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (2019-11-20)
      Research report
      Open access
      High-resolution and variable-shape images have not yet been properly addressed by the AI community. The approach of down-sampling data often used with convolutional neural networks is sub-optimal for many tasks, and has ...
    • Methodology to predict scalability of parallel applications 

      Rosas, Claudia; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Barcelona Supercomputing Center, 2015-05-05)
      Text in conference proceedings
      Open access
      On the road to exascale computing, inferring the expected performance of parallel applications is a complex task. Performance analysts need to identify the behavior of the applications and to extrapolate it to ...
    • MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain 

      Valero-Lara, Pedro; Sirvent, Raül; Peña, Antonio J.; Labarta Mancho, Jesús José (2019-05)
      Article
      Open access
      The simulation of the behavior of the human brain is one of the most ambitious challenges today, with no end of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
    • MPI+OpenMP tasking scalability for the simulation of the human brain 

      Valero-Lara, Pedro; Sirvent, Raul; Pena, A. J.; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2018)
      Text in conference proceedings
      Open access
      The simulation of the behavior of the human brain is one of the most ambitious challenges today, with no end of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
    • MPI+X: task-based parallelisation and dynamic load balance of finite element assembly 

      Garcia, Marta; Houzeaux, Guillaume; Ferrer, Roger; Artigues, Antoni; López, Victor; Labarta Mancho, Jesús José; Vázquez, Mariano (Taylor & Francis, 2019-05)
      Article
      Open access
      The main computing phases of numerical methods for solving partial differential equations are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of a hybrid MPI+X ...