
    • A case for user-level dynamic page migration 

      Nikolopoulos, Dimitrios; Papatheodorou, Theodore; Polychronopoulos, Constantine D.; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2000)
      Conference report
      Open Access
      This paper presents user-level dynamic page migration, a runtime technique which transparently enables parallel programs to tune their memory performance on distributed shared memory multiprocessors, with feedback obtained ...
    • A dependency-aware task-based programming environment for multi-core architectures 

      Pérez Cáncer, Josep Maria; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (2008)
      Conference report
      Open Access
      Parallel programming on SMP and multi-core architectures is hard. In this paper we present a programming model for those environments based on automatic function level parallelism that strives to be easy, flexible, portable, ...
    • A dynamic periodicity detector: application to speedup computation 

      Freitag, Fèlix; Corbalán González, Julita; Labarta Mancho, Jesús José (IEEE, 2001-04)
      Conference report
      Open Access
      We propose a dynamic periodicity detector (DPD) for the estimation of periodicities in data series obtained from the execution of applications. We analyze the algorithm used by the periodicity detector and its performance ...
    • A framework for integrating data alignment, distribution, and redistribution in distributed memory multiprocessors 

      García Almiñana, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2001-04)
      Article
      Restricted access - publisher's policy
      Parallel architectures with physically distributed memory provide a cost-effective scalability to solve many large scale scientific problems. However, these systems are very difficult to program and tune. In these systems, ...
    • A high-productivity task-based programming model for clusters 

      Tejedor Saavedra, Enric; Farreras Esclusa, Montserrat; Grove, David; Badia Sala, Rosa Maria; Almasi, Gheorghe; Labarta Mancho, Jesús José (2012-12-15)
      Article
      Restricted access - publisher's policy
      Programming for large-scale, multicore-based architectures requires adequate tools that offer ease of programming and do not hinder application performance. StarSs is a family of parallel programming models based on automatic ...
    • A library implementation of the nano-threads programming model 

      Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Navarro, Nacho; Ayguadé Parra, Eduard (Springer, 1996)
      Conference report
      Open Access
      In this paper we describe the design and implementation of a user-level thread package based on the nano-threads programming model, whose goal is to efficiently manage the application parallelism at user-level. Nano-thread ...
    • A proposal for error handling in OpenMP 

      Duran González, Alejandro; Ferrer, Roger; Costa Prats, Juan José; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2008)
      Article
      Restricted access - publisher's policy
      OpenMP has focused on performance for numerical applications, but when we try to move this focus to other kinds of applications, such as Web servers, we detect one important gap. In these applications, performance ...
    • A proposal to extend the OpenMP tasking model with dependent tasks 

      Duran González, Alejandro; Ferrer, Roger; Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (2009)
      Article
      Restricted access - publisher's policy
      Tasking in OpenMP 3.0 has been conceived to handle the dynamic generation of unstructured parallelism. New directives have been added allowing the user to identify units of independent work (tasks) and to define points to ...
    • A runtime heuristic to selectively replicate tasks for application-specific reliability targets 

      Subasi, Omer; Yalcin, Gulay; Zyulkyarov, Ferad; Unsal, Osman Sabri; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Open Access
      In this paper we propose a runtime-based selective task replication technique for task-parallel high performance computing applications. Our selective task replication technique is automatic and does not require ...
    • A simulation framework to automatically analyze the communication-computation overlap in scientific applications 

      Subotic, Vladimir; Sancho, Jose Carlos; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2010)
      Conference report
      Open Access
      Overlapping communication and computation has been devised as an attractive technique to alleviate applications' huge network requirements at large scale. Overlapping makes it possible to fully or partially hide the long ...
    • A trace-scaling agent for parallel application tracing

      Freitag, Fèlix; Caubet Serrabou, Jordi; Labarta Mancho, Jesús José (IEEE, 2002)
      Conference report
      Open Access
      Tracing and performance analysis tools are an important component in the development of high performance applications. Tracing parallel programs with current tracing tools, however, easily leads to large trace files with ...
    • A transparent runtime data distribution engine for OpenMP 

      Nikolopoulos, Dimitrios; Papatheodorou, Theodore; Polychronopoulos, C D; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (2001-07)
      Article
      Restricted access - publisher's policy
      This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the ...
    • A visual embedding for the unsupervised extraction of abstract semantics 

      García Gasulla, Dario; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Béjar Alonso, Javier; Cortés García, Claudio Ulises; Suzumura, Toyotaro; Chen, R (2017-05-01)
      Article
      Open Access
      Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector ...
    • Align and distribute-based linear loop transformations 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Springer, 1993)
      Conference report
      Open Access
      In this paper we generalize the framework of linear loop transformations in the sense that loop alignment is considered as a new component in the transformation process. The aim is to match the structure of loop nests with ...
    • ALOJA: a systematic study of Hadoop deployment variables to enable automated characterization of cost-effectiveness 

      Poggi Mastrokalo, Nicolas; Carrera Pérez, David; Call, Aaron; Mendoza, Sergio; Becerra Fontal, Yolanda; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Gagliardi, Fabrizio; Labarta Mancho, Jesús José; Reinauer, Rob; Vujic, Nikola; Green, Daron; Blakeley, Jose (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Open Access
      This article presents the ALOJA project, an initiative to produce mechanisms for an automated characterization of cost-effectiveness of Hadoop deployments and reports its initial results. ALOJA is the latest phase of a ...
    • AMA: asynchronous management of accelerators for task-based programming models 

      Planas, Judit; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Elsevier, 2015)
      Conference report
      Open Access
      Computational science has benefited in the last years from emerging accelerators that increase the performance of scientific simulations, but using these devices hinders the programming task. This paper presents AMA: a set ...
    • An out-of-the-box full-network embedding for convolutional neural networks 

      Garcia-Gasulla, Dario; Vilalta Arias, Armand; Parés, Ferran; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises; Suzumura, Toyotaro (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Open Access
      Features extracted through transfer learning can be used to exploit deep learning representations in contexts where there are very few training samples, where there are limited computational resources, or when the tuning ...
    • Analysis and simulation of multiplexed single-bus networks with and without buffering 

      Llaberia Griñó, José M.; Valero Cortés, Mateo; Herrada Lillo, Enrique; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 1985)
      Conference report
      Open Access
      Performance issues of a single-bus interconnection network for multiprocessor systems, operating in a multiplexed way, are presented in this paper. Several models are developed and used to allow system performance evaluation. ...
    • Application Acceleration on FPGAs with OmpSs@FPGA 

      Bosch, Jaume; Tan, Xubin; Filgueras Izquierdo, Antonio; Vidal-Piñol, Miquel; Mateu, Marc; Jiménez-González, Daniel; Álvarez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous ...
    • Applying interposition techniques for performance analysis of OpenMP parallel applications

      González Tallada, Marc; Serra, Albert; Martorell Bofill, Xavier; Oliver Segura, José; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Navarro, Nacho (Institute of Electrical and Electronics Engineers (IEEE), 2000)
      Conference report
      Open Access
      Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performance degradation may arise. We believe that ...