    • A case for user-level dynamic page migration 

      Nikolopoulos, Dimitrios; Papatheodorou, Theodore; Polychronopoulos, Constantine D.; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2000)
      Conference report
      Open Access
      This paper presents user-level dynamic page migration, a runtime technique which transparently enables parallel programs to tune their memory performance on distributed shared memory multiprocessors, with feedback obtained ...
    • A dependency-aware task-based programming environment for multi-core architectures 

      Pérez Cáncer, Josep Maria; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (2008)
      Conference report
      Open Access
      Parallel programming on SMP and multi-core architectures is hard. In this paper we present a programming model for those environments based on automatic function level parallelism that strives to be easy, flexible, portable, ...
    • A dynamic periodicity detector: application to speedup computation 

      Freitag, Fèlix; Corbalán González, Julita; Labarta Mancho, Jesús José (IEEE, 2001-04)
      Conference report
      Open Access
      We propose a dynamic periodicity detector (DPD) for the estimation of periodicities in data series obtained from the execution of applications. We analyze the algorithm used by the periodicity detector and its performance ...
    • A framework for integrating data alignment, distribution, and redistribution in distributed memory multiprocessors 

      García Almiñana, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2001-04)
      Article
      Restricted access - publisher's policy
      Parallel architectures with physically distributed memory provide cost-effective scalability for solving many large-scale scientific problems. However, these systems are very difficult to program and tune. In these systems, ...
    • A high-productivity task-based programming model for clusters 

      Tejedor Saavedra, Enric; Farreras Esclusa, Montserrat; Grove, David; Badia Sala, Rosa Maria; Almasi, Gheorghe; Labarta Mancho, Jesús José (2012-12-15)
      Article
      Restricted access - publisher's policy
      Programming for large-scale, multicore-based architectures requires adequate tools that offer ease of programming and do not hinder application performance. StarSs is a family of parallel programming models based on automatic ...
    • A library implementation of the nano-threads programming model 

      Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Navarro, Nacho; Ayguadé Parra, Eduard (Springer, 1996)
      Conference report
      Open Access
      In this paper we describe the design and implementation of a user-level thread package based on the nano-threads programming model, whose goal is to efficiently manage the application parallelism at user-level. Nano-thread ...
    • A proposal for error handling in OpenMP 

      Duran González, Alejandro; Ferrer, Roger; Costa Prats, Juan José; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2008)
      Article
      Restricted access - publisher's policy
      OpenMP has focused on performance for numerical applications, but when we try to move this focus to other kinds of applications, such as Web servers, we detect one important shortcoming. In these applications, performance ...
    • A proposal to extend the OpenMP tasking model with dependent tasks 

      Duran González, Alejandro; Ferrer, Roger; Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (2009)
      Article
      Restricted access - publisher's policy
      Tasking in OpenMP 3.0 has been conceived to handle the dynamic generation of unstructured parallelism. New directives have been added allowing the user to identify units of independent work (tasks) and to define points to ...
    • A runtime heuristic to selectively replicate tasks for application-specific reliability targets 

      Subasi, Omer; Yalcin, Gulay; Zyulkyarov, Ferad; Unsal, Osman Sabri; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Open Access
      In this paper we propose a runtime-based selective task replication technique for task-parallel high performance computing applications. Our selective task replication technique is automatic and does not require ...
    • A simulation framework to automatically analyze the communication-computation overlap in scientific applications 

      Subotic, Vladimir; Sancho, Jose Carlos; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2010)
      Conference report
      Open Access
      Overlapping communication and computation has been devised as an attractive technique to alleviate the huge network requirements of applications at large scale. Overlapping makes it possible to fully or partially hide the long ...
    • A trace-scaling agent for parallel application tracing 

      Freitag, Fèlix; Caubet Serrabou, Jordi; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2002)
      Conference report
      Open Access
      Tracing and performance analysis tools are an important component in the development of high performance applications. Tracing parallel programs with current tracing tools, however, easily leads to large trace files with ...
    • A trace-scaling agent for parallel application tracing. 

      Freitag, Fèlix; Caubet Serrabou, Jordi; Labarta Mancho, Jesús José (IEEE, 2002)
      Conference report
      Open Access
      Tracing and performance analysis tools are an important component in the development of high performance applications. Tracing parallel programs with current tracing tools, however, easily leads to large trace files with ...
    • A transparent runtime data distribution engine for OpenMP 

      Nikolopoulos, Dimitrios; Papatheodorou, Theodore; Polychronopoulos, C D; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (2001-07)
      Article
      Restricted access - publisher's policy
      This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the ...
    • A visual embedding for the unsupervised extraction of abstract semantics 

      Garcia Gasulla, Dario; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Béjar Alonso, Javier; Cortés García, Claudio Ulises; Suzumura, Toyotaro; Chen, R (2017-05-01)
      Article
      Open Access
      Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector ...
    • Accelerating FFT using NEC SX-Aurora vector engine 

      Vizcaíno Serrano, Pablo; Mantovani, Filippo; Labarta Mancho, Jesús José (Springer, 2021)
      Conference report
      Open Access
      Novel architectures leveraging long and variable vector lengths, like the NEC SX-Aurora or the vector extension of RISC-V, are appearing as promising solutions in the supercomputing market. These architectures often require ...
    • Acceleration with long vector architectures: Implementation and evaluation of the FFT kernel on NEC SX-Aurora and RISC-V vector extension 

      Vizcaíno Serrano, Pablo; Mantovani, Filippo; Ferrer Ibañez, Roger; Labarta Mancho, Jesús José (Wiley (John Wiley & Sons), 2023-09-10)
      Article
      Open Access
      Novel architectures leveraging long and variable vector lengths, like the NEC SX-Aurora or the vector extension of RISC-V, are appearing as promising solutions in the supercomputing market. These architectures often require ...
    • Align and distribute-based linear loop transformations 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Springer, 1993)
      Conference report
      Open Access
      In this paper we generalize the framework of linear loop transformations in the sense that loop alignment is considered as a new component in the transformation process. The aim is to match the structure of loop nests with ...
    • ALOJA: a systematic study of Hadoop deployment variables to enable automated characterization of cost-effectiveness 

      Poggi Mastrokalo, Nicolas; Carrera Pérez, David; Call, Aaron; Mendoza, Sergio; Becerra Fontal, Yolanda; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Gagliardi, Fabrizio; Labarta Mancho, Jesús José; Reinauer, Rob; Vujic, Nikola; Green, Daron; Blakeley, Jose (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Open Access
      This article presents the ALOJA project, an initiative to produce mechanisms for an automated characterization of cost-effectiveness of Hadoop deployments and reports its initial results. ALOJA is the latest phase of a ...
    • AMA: asynchronous management of accelerators for task-based programming models 

      Planas, Judit; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Elsevier, 2015)
      Conference report
      Open Access
      Computational science has benefited in recent years from emerging accelerators that increase the performance of scientific simulations, but using these devices hinders the programming task. This paper presents AMA: a set ...
    • An extension of the StarSs programming model for platforms with multiple GPUs 

      Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Igual Peña, Francisco D.; Labarta Mancho, Jesús José; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador (Springer, 2009)
      Conference lecture
      Open Access
      While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/power ratio can be attained using specialized ...