Now showing items 1-20 of 21

    • A Review of Lightweight Thread Approaches for High Performance Computing 

      Castelló, Adrián; Peña, Antonio J.; Seo, Sangmin; Mayo, Rafael; Balaji, Pavan; Quintana Ortí, Enrique Salvador (IEEE, 2016-12-08)
      Text in conference proceedings
      Open access
      High-level, directive-based solutions are becoming the programming models (PMs) of the multi/many-core architectures. Several solutions relying on operating system (OS) threads perfectly work with a moderate number of ...
    • Automating the application data placement in hybrid memory systems 

      Servat, Harald; Peña, Antonio J.; Llort, German; Mercadal, Estanislao; Hoppe, Hans-Christian; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Text in conference proceedings
      Open access
      Multi-tiered memory systems, such as those based on Intel® Xeon Phi™ processors, are equipped with several memory tiers with different characteristics including, among others, capacity, access latency, bandwidth, energy ...
    • cuHinesBatch: solving multiple Hines systems on GPUs Human Brain Project 

      Valero-Lara, Pedro; Martinez-Perez, Ivan; Peña, Antonio J.; Martorell Bofill, Xavier; Sirvent, Raul; Labarta Mancho, Jesús José (Elsevier, 2017)
      Article
      Open access
      The simulation of the behavior of the Human Brain is one of the most important challenges today in computing. The main problem consists of finding efficient ways to manipulate and compute the huge volume of data that this ...
    • cuThomasBatch and cuThomasVBatch, CUDA Routines to compute batch of tridiagonal systems on NVIDIA GPUs 

      Valero Lara, Pedro; Martinez Pérez, Ivan; Sirvent, Raül; Martorell Bofill, Xavier; Peña, Antonio J. (Wiley, 2018-01-01)
      Article
      Open access
      The solving of tridiagonal systems is one of the most computationally expensive parts in many applications, so that multiple studies have explored the use of NVIDIA GPUs to accelerate such computation. However, these studies ...
    • Dynamic Adaptable Asynchronous Progress Model for MPI RMA Multiphase Applications 

      Si, Min; Peña, Antonio J.; Hammond, Jeff; Balaji, Pavan; Takagi, Masamichi; Ishikawa, Yutaka (IEEE, 2018-09-01)
      Article
      Open access
      Casper is a process-based asynchronous progress model for MPI one-sided communication on multi- and many-core architectures. The one-sided communication is not truly one-sided in most MPI implementations: the target process ...
    • Efficient data sharing on heterogeneous systems 

      García-Flores, Víctor; Ayguadé Parra, Eduard; Peña, Antonio J. (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Text in conference proceedings
      Access restricted by publisher policy
      General-purpose computing on GPUs has become more accessible due to features such as shared virtual memory and demand paging. Unfortunately it comes at a price, and that is performance. Automatic memory management is ...
    • Efficient Scalable Computing through Flexible Applications and Adaptive Workloads 

      Iserte, Sergio; Mayo, Rafael; Quintana Ortí, Enrique Salvador; Beltran Querol, Vicenç; Peña, Antonio J. (IEEE, 2017-09-07)
      Conference presentation
      Open access
      In this paper we introduce a methodology for dynamic job reconfiguration driven by the programming model runtime in collaboration with the global resource manager. We improve the system throughput by exploiting malleability ...
    • Enabling CUDA acceleration within virtual machines using rCUDA 

      Duato, José; Peña, Antonio J.; Silla, Federico; Fernández, Juan C.; Mayo, Rafael; Quintana Ortí, Enrique Salvador (IEEE, 2012-02-16)
      Conference presentation
      Open access
      The hardware and software advances of Graphics Processing Units (GPUs) have favored the development of GPGPU (General-Purpose Computation on GPUs) and its adoption in many scientific, engineering, and industrial areas. ...
    • Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models 

      Castelló, Adrian; Peña, Antonio J.; Mayo, Rafael; Planas, Judit; Quintana Ortí, Enrique Salvador; Balaji, Pavan (Springer US, 2016-06-21)
      Article
      Open access
      Directive-based programming models, such as OpenMP, OpenACC, and OmpSs, enable users to accelerate applications by using coprocessors with little effort. These devices offer significant computing power, but their use can ...
    • Exploring the Vision Processing Unit as Co-Processor for Inference 

      Rivas-Gomez, Sergio; Peña, Antonio J.; Moloney, David; Laure, Erwin; Markidis, Stefano (IEEE, 2018-08-06)
      Conference presentation
      Open access
      The success of the exascale supercomputer is largely debated to remain dependent on novel breakthroughs in technology that effectively reduce the power consumption and thermal dissipation requirements. In this work, we ...
    • GLT: A Unified API for Lightweight Thread Libraries 

      Castelló, Adrián; Seo, Sangmin; Mayo, Rafael; Balaji, Pavan; Quintana Ortí, Enrique Salvador; Peña, Antonio J. (Springer, 2017-08)
      Text in conference proceedings
      Open access
      In recent years, several lightweight thread (LWT) libraries have emerged to tackle exascale challenges. These offer programming models (PMs) based on user-level threads and incorporate their own lightweight mechanisms. ...
    • GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations 

      Castelló, Adrián; Mayo, Rafael; Quintana Ortí, Enrique Salvador; Seo, Sangmin; Balaji, Pavan; Peña, Antonio J. (IEEE, 2017-09-07)
      Conference presentation
      Open access
      OpenMP is the de facto standard application programming interface (API) for on-node parallelism. The most popular OpenMP runtimes rely on POSIX threads (pthreads) implementations that offer an excellent performance for ...
    • Improving the interoperability between MPI and task-based programming models 

      Sala Penadés, Kevin; Bellón, Jorge; Farré, Pau; Teruel, Xavier; Pérez, Josep M.; Peña, Antonio J.; Holmes, Daniel; Beltran Querol, Vicenç; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2018)
      Text in conference proceedings
      Open access
      In this paper we propose an API to pause and resume task execution depending on external events. We leverage this generic API to improve the interoperability between MPI synchronous communication primitives and tasks. When ...
    • Integrating blocking and non-blocking MPI primitives with task-based programming models 

      Sala Penadés, Kevin; Teruel García, Xavier; Pérez Cáncer, Josep Maria; Peña, Antonio J.; Beltran, Vicenç; Labarta Mancho, Jesús José (2019-07)
      Article
      Open access
      In this paper we present the Task-Aware MPI library (TAMPI) that integrates both blocking and non-blocking MPI primitives with task-based programming models. The TAMPI library leverages two new runtime APIs to improve both ...
    • Integrating memory perspective into the BSC performance tools 

      Servat, Harald; Labarta Mancho, Jesús José; Hoppe, Hans-Christian; Gimenez, Judit; Peña, Antonio J. (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Text in conference proceedings
      Open access
      The growing gap between processor and memory speeds results in complex memory hierarchies as processors evolve to mitigate such differences by taking advantage of locality of reference. In this direction, the BSC performance ...
    • MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain 

      Valero-Lara, Pedro; Sirvent, Raül; Peña, Antonio J.; Labarta Mancho, Jesús José (2019-05)
      Article
      Open access
      The simulation of the behavior of the human brain is one of the most ambitious challenges today with a non-end of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
    • On the adequacy of lightweight thread approaches for high-level parallel programming models 

      Castelló, Adrián; Mayo Gual, Rafael; Sala Penadés, Kevin; Beltran Querol, Vicenç; Balaji, Pavan; Peña, Antonio J. (Elsevier, 2018-07)
      Article
      Open access
      High-level parallel programming models (PMs) are becoming crucial in order to extract the computational power of current on-node multi-threaded parallelism. The most popular PMs, such as OpenMP or OmpSs, are directive-based: ...
    • Simulating the behavior of the human brain on GPUs 

      Valero-Lara, Pedro; Martinez-Perez, Ivan; Sirvent, Raul; Peña, Antonio J.; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (2018-01-01)
      Article
      Open access
      The simulation of the behavior of the Human Brain is one of the most important challenges in computing today. The main problem consists of finding efficient ways to manipulate and compute the huge volume of data that this ...
    • Supporting automatic recovery in offloaded distributed programming models through MPI-3 techniques 

      Peña, Antonio J.; Beltran Querol, Vicenç; Clauss, Carsten; Moschny, Thomas (ACM Digital Library, 2017-06-15)
      Conference presentation
      Open access
      In this paper we describe the design of fault tolerance capabilities for general-purpose offload semantics, based on the OmpSs programming model. Using ParaStation MPI, a production MPI-3.1 implementation, we explore the ...
    • Tasking in accelerators: performance evaluation 

      Toledo, Leonel; Peña, Antonio J.; Catalán, Sandra; Valero-Lara, Pedro (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Text in conference proceedings
      Open access
      In this work, we analyze the implications and results of implementing dynamic parallelism, concurrent kernels and CUDA Graphs to solve task-oriented problems. As a benchmark we propose three different methods for solving ...