Now showing items 189-208 of 326

    • NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP 

      González Tallada, Marc; Ayguadé Parra, Eduard; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Navarro, Nacho; Oliver Segura, José (2000-10)
      Article
      Restricted access - publisher's policy
      This paper describes the support provided by the NanosCompiler to nested parallelism in OpenMP. The NanosCompiler is a source-to-source parallelizing compiler implemented around a hierarchical internal program representation ...
    • Nebelung: execution environment for transactional OpenMP 

      Milovanovic, M; Ferrer, Roger; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2008-06)
      Article
      Restricted access - publisher's policy
      Future generations of Chip Multiprocessors (CMP) will provide dozens or even hundreds of cores inside the chip. Writing applications that benefit from the massive computational power offered by these chips is not going to ...
    • New OpenMP directives for irregular data access loops 

      Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Oliver Segura, José; Henty, David (2002-03)
      Article
      Restricted access - publisher's policy
      Many scientific applications involve array operations that are sparse in nature, i.e. array elements depend on the values of relatively few elements of the same or another array. When parallelised in the shared-memory model, ...
    • Node architecture implications for in-memory data analytics on scale-in clusters 

      Awan, Ashan Javed; Vlassov, Vladimir; Brorsson, Mats; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2016)
      Conference proceedings paper
      Restricted access - publisher's policy
      While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics. Recent studies propose scale-in clusters ...
    • Non-consistent dual register files to reduce register pressure 

      Llosa Espuny, José Francisco; Valero Cortés, Mateo; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 1995)
      Conference proceedings paper
      Open access
      The continuous growth in instruction-level parallelism offered by microprocessors requires a large register file and a large number of ports to access it. This paper presents the non-consistent dual register file, an alternative ...
    • O(n) key–value sort with active compute memory 

      Esmaili Dokht, Pouya; Guiot Cusido, Miquel; Radojkovic, Petar; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Adlard, Jason; Amato, Paolo; Sforzin, Marco (Institute of Electrical and Electronics Engineers (IEEE), 2024-05)
      Article
      Open access
      We propose the Active Compute Memory (ACM), a near-memory-processing architecture capable of performing key–value sort directly in the DRAM. In the ACM architecture, sort is merely the writing of data into memory with one ...
    • Obtaining synchronization-free code with maximum parallelism 

      Gavaldà Mestre, Ricard; Ayguadé Parra, Eduard; Torres Viñals, Jordi (1996-03)
      Research report
      Open access
      This paper addresses the problem of extracting the maximum synchronization-free parallelism that may be present in loops. In order to reduce communication and synchronization overheads, some parallelizing compilers ...
    • OmpSs-2@Cluster: Distributed memory execution of nested OpenMP-style tasks 

      Aguilar Mena, Jimmy; Ali, Omar Shaaban Ibrahim; Beltran Querol, Vicenç; Carpenter, Paul Matthew; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer Nature, 2022)
      Conference proceedings paper
      Open access
      State-of-the-art programming approaches generally have a strict division between intra-node shared memory parallelism and inter-node MPI communication. Tasking with dependencies offers a clean, dependable abstraction for ...
    • OmpSs-OpenCL programming model for heterogeneous systems 

      Elangovan, Vinoth Krishnan; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard (Springer, 2013)
      Conference proceedings paper
      Restricted access - publisher's policy
      The advent of heterogeneous computing has forced programmers to use platform specific programming paradigms in order to achieve maximum performance. This approach has a steep learning curve for programmers and also has ...
    • OmpSs@cloudFPGA: An FPGA task-based programming model with message passing 

      Haro Ruiz, Juan Miguel de; Cano, Rubén; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Abel, François; Ringlein, Burkhard; Weiss, Beat (Institute of Electrical and Electronics Engineers (IEEE), 2022)
      Conference proceedings paper
      Open access
      Nowadays, a new parallel paradigm for energy-efficient heterogeneous hardware infrastructures is required to achieve better performance at a reasonable cost on high-performance computing applications. Under this new paradigm, ...
    • OmpSs@FPGA framework for high performance FPGA computing 

      Haro Ruiz, Juan Miguel de; Bosch Pons, Jaume; Filgueras Izquierdo, Antonio; Vidal, Miquel; Jiménez González, Daniel; Álvarez Martínez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2021-12-01)
      Article
      Open access
      This paper presents the new features of the OmpSs@FPGA framework. OmpSs is a data-flow programming model that supports task nesting and dependencies to target asynchronous parallelism and heterogeneity. OmpSs@FPGA is the ...
    • On automatic loop data-mapping for distributed-memory multiprocessors 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Llaberia Griñó, José M.; Valero Cortés, Mateo (Springer, 1991)
      Conference proceedings paper
      Open access
      In this paper we present a unified approach for compiling programs for Distributed-Memory Multiprocessors (DMM). Parallelization of sequential programs for DMM is much more difficult to achieve than for shared memory systems ...
    • On the behavior of convolutional nets for feature extraction 

      Garcia-Gasulla, Dario; Parés Pont, Ferran; Vilalta Arias, Armand; Moreno, Jonatan; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises; Suzumura, Toyotaro (2018-03)
      Article
      Open access
      Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive ...
    • On the instrumentation of OpenMP and OmpSs Tasking constructs 

      Servat, Harald; Teruel, Xavier; Llort Sánchez, Germán; Duran González, Alejandro; Giménez, J.; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2012)
      Conference proceedings paper
      Open access
      Parallelism has become more and more commonplace with the advent of multicore processors. Although different parallel programming models have arisen to exploit the computing capabilities of such processors, ...
    • On the maturity of parallel applications for asymmetric multi-core processors 

      Chronaki, Kallia; Moretó Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Elsevier, 2019-05-01)
      Article
      Open access
      Asymmetric multi-cores (AMCs) are a successful architectural solution for both mobile devices and supercomputers. By maintaining two types of cores (fast and slow) AMCs are able to provide high performance under the facility ...
    • On the representativeness of convolutional neural networks layers 

      García Gasulla, Darío; Moreno, Jonatan; Ramos-Pollan, Raúl; Casadiegos Barrios, Romel; Béjar Alonso, Javier; Cortés García, Claudio Ulises; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Suzumura, Toyotaro (IOS PRESS EBOOKS, 2016)
      Book chapter
      Open access
      Convolutional Neural Networks (CNN) are the most popular of deep network models due to their applicability and success in image processing. Although plenty of effort has been made in designing and training better discriminative ...
    • On the roles of the programmer, the compiler and the runtime system when programming accelerators in OpenMP 

      Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2014)
      Conference proceedings paper
      Open access
      OpenMP includes in its latest 4.0 specification the accelerator model. In this paper we present a partial implementation of this specification in the OmpSs programming model developed at the Barcelona Supercomputing Center ...
    • OpenMP extensions for FPGA Accelerators 

      Cabrera, Daniel; Martorell Bofill, Xavier; Gaydadjiev, Georgi; Ayguadé Parra, Eduard; Jiménez González, Daniel (2009-07)
      Conference proceedings paper
      Open access
      Reconfigurable computing is one of the paths to explore towards low-power supercomputing. However, programming these reconfigurable devices is not an easy task and still requires significant research and development efforts ...
    • OpenMP tasking analysis for programmers

      Teruel, Xavier; Barton, Christopher; Duran González, Alejandro; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Unnikrishnan, Priya; Zhang, Guansong; Silvera, Raul (2009-11)
      Conference proceedings paper
      Restricted access - publisher's policy
      As of 2008, the OpenMP 3.0 standard includes task support allowing programmers to exploit irregular parallelism. Although several compilers are providing support for this new feature there has not been extensive investigation ...
    • OpenMP tasks in IBM XL compilers 

      Teruel, Xavier; Unnikrishnan, Priya; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Silvera, Raul; Zhang, Guansong; Tiotto, Ettore (Association for Computing Machinery (ACM), 2008)
      Conference presentation
      Open access
      Tasking is the most significant feature included in the new OpenMP 3.0 standard. It was introduced to handle unstructured parallelism and broaden the range of applications that can be parallelized by OpenMP. This paper ...