Now showing items 84-103 of 140

    • O(n) key–value sort with active compute memory 

      Esmaili Dokht, Pouya; Guiot Cusido, Miquel; Radojkovic, Petar; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Adlard, Jason; Amato, Paolo; Sforzin, Marco (Institute of Electrical and Electronics Engineers (IEEE), 2024-05)
      Article
      Open access
      We propose the Active Compute Memory (ACM), a near-memory-processing architecture capable of performing key–value sort directly in the DRAM. In the ACM architecture, sort is merely the writing of data into memory with one ...
    • OmpSs@cloudFPGA: An FPGA task-based programming model with message passing 

      Haro Ruiz, Juan Miguel de; Cano, Rubén; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Abel, François; Ringlein, Burkhard; Weiss, Beat (Institute of Electrical and Electronics Engineers (IEEE), 2022)
      Conference proceedings paper
      Open access
      Nowadays, a new parallel paradigm for energy-efficient heterogeneous hardware infrastructures is required to achieve better performance at a reasonable cost on high-performance computing applications. Under this new paradigm, ...
    • OmpSs@FPGA framework for high performance FPGA computing 

      Haro Ruiz, Juan Miguel de; Bosch Pons, Jaume; Filgueras Izquierdo, Antonio; Vidal, Miquel; Jiménez González, Daniel; Álvarez Martínez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2021-12-01)
      Article
      Open access
      This paper presents the new features of the OmpSs@FPGA framework. OmpSs is a data-flow programming model that supports task nesting and dependencies to target asynchronous parallelism and heterogeneity. OmpSs@FPGA is the ...
    • OmpSs@Zynq All-Programmable SoC Ecosystem 

      Filgueras Izquierdo, Antonio; Gil Blasco, Eduard; Jiménez González, Daniel; Álvarez Martínez, Carlos; Martorell Bofill, Xavier; Langer, Jan; Noguera Serra, Juan José; Vissers, Kees (Association for Computing Machinery (ACM), 2014)
      Conference proceedings paper
      Access restricted by publisher policy
      OmpSs is an OpenMP-like directive-based programming model that includes heterogeneous execution (MIC, GPU, SMP, etc.) and runtime task dependencies management. Indeed, OmpSs has largely influenced the recently appeared ...
    • On the instrumentation of OpenMP and OmpSs Tasking constructs 

      Servat, Harald; Teruel, Xavier; Llort Sánchez, Germán; Duran González, Alejandro; Giménez, J.; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2012)
      Conference proceedings paper
      Open access
      Parallelism has become more and more commonplace with the advent of the multicore processors. Although different parallel programming models have arisen to exploit the computing capabilities of such processors, ...
    • OpenMP extensions for FPGA Accelerators 

      Cabrera, Daniel; Martorell Bofill, Xavier; Gaydadjiev, Georgi; Ayguadé Parra, Eduard; Jiménez González, Daniel (2009-07)
      Conference proceedings paper
      Open access
      Reconfigurable computing is one of the paths to explore towards low-power supercomputing. However, programming these reconfigurable devices is not an easy task and still requires significant research and development efforts ...
    • OpenMP tasking analysis for programmers 

      Teruel, Xavier; Barton, Christopher; Duran González, Alejandro; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Unnikrishnan, Priya; Zhang, Guansong; Silvera, Raul (2009-11)
      Conference proceedings paper
      Access restricted by publisher policy
      As of 2008, the OpenMP 3.0 standard includes task support allowing programmers to exploit irregular parallelism. Although several compilers are providing support for this new feature there has not been extensive investigation ...
    • OpenMP tasks in IBM XL compilers 

      Teruel, Xavier; Unnikrishnan, Priya; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Silvera, Raul; Zhang, Guansong; Tiotto, Ettore (Association for Computing Machinery (ACM), 2008)
      Conference presentation
      Open access
      Tasking is the most significant feature included in the new OpenMP 3.0 standard. It was introduced to handle unstructured parallelism and broaden the range of applications that can be parallelized by OpenMP. This paper ...
    • Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture 

      Ródenas Picó, David; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Almási, George; Cascaval, Calin; Castaños, José G.; Moreira, Jose E. (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference proceedings paper
      Open access
      In this paper, we present two approaches to improve the execution of OpenMP applications on the IBM Cyclops multithreaded architecture. Both solutions are independent and they are focused to obtain better performance through ...
    • Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL 

      Ferrer, Roger; Planas Carbonell, Judit; Bellens, Pieter; Duran González, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2011)
      Article
      Access restricted by publisher policy
      In this paper, we present OMPSs, a programming model based on OpenMP and StarSs, that can also incorporate the use of OpenCL or CUDA kernels. We evaluate the proposal on three different architectures, SMP, Cell/B.E. and ...
    • Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL 

      Ferrer, Roger; Planas Carbonell, Judit; Bellens, Pieter; Duran González, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2010)
      Conference proceedings paper
      Access restricted by publisher policy
      In this paper, we present OMPSs, a programming model based on OpenMP and StarSs, that can also incorporate the use of OpenCL or CUDA kernels. We evaluate the proposal on three different architectures, SMP, Cell/B.E. and ...
    • Parallel programming issues and what the compiler can do to help 

      Royuela, Sara; Martorell Bofill, Xavier (Barcelona Supercomputing Center, 2015-05-05)
      Conference proceedings paper
      Open access
      Twenty-first century parallel programming models are becoming real complex due to the diversity of architectures they need to target (Multi- and Many-cores, GPUs, FPGAs, etc.). What if we could use one programming model ...
    • Parallelware tools: an experimental evaluation on POWER systems 

      Arenaz Silva, Manuel; Martorell Bofill, Xavier (Springer, 2019)
      Conference proceedings paper
      Open access
      Static code analysis tools are designed to aid software developers to build better quality software in less time, by detecting defects early in the software development life cycle. Even the most experienced developer ...
    • Particle-in-cell simulation using asynchronous tasking 

      Guidotti, Nicolas; Ceyrat, Pedro; Barreto, João; Monteiro, José; Rodrigues, Rodrigo; Fonseca, Ricardo; Martorell Bofill, Xavier; Peña Monferrer, Antonio José (Springer Nature, 2021)
      Conference proceedings paper
      Open access
      Recently, task-based programming models have emerged as a prominent alternative among shared-memory parallel programming paradigms. Inherently asynchronous, these models provide native support for dynamic load balancing ...
    • Performance and energy effects on task-based parallelized applications: User-directed versus manual vectorization 

      Caminal Pallarés, Helena; Caballero de Gea, Diego; Cebrián González, Juan Manuel; Ferrer, Roger; Casas, Marc; Moretó Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (2018-06)
      Article
      Open access
      Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...
    • Preliminary work on a mechanism for testing a customized architecture 

      González, Cecilia; Jiménez González, Daniel; Martorell Bofill, Xavier; Álvarez Martínez, Carlos; Gaydadjiev, Georgi (2009-07)
      Conference proceedings paper
      Open access
      Hardware customization for scientific applications has shown a big potential for reducing power consumption and increasing performance. In particular, the automatic generation of ISA extensions for General-Purpose Processors ...
    • Productive cluster programming with OmpSs 

      Bueno Hedo, Javier; Martinell Andreu, Luis; Duran González, Alejandro; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2011)
      Conference proceedings paper
      Access restricted by publisher policy
      Clusters of SMPs are ubiquitous. They have been traditionally programmed by using MPI. But, the productivity of MPI programmers is low because of the complexity of expressing parallelism and communication, and the difficulty ...
    • Productive programming of GPU clusters with OmpSs 

      Bueno Hedo, Javier; Planas, Judit; Duran González, Alejandro; Badia Sala, Rosa Maria; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2012)
      Conference presentation
      Access restricted by publisher policy
      Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of hybrid models that increase the complexity of the applications, reducing the productivity of programmers. We present ...
    • Real-time GPU-based face detection in HD video sequences 

      Oro, David; Fernández, Carles; Rodriguez Saeta, Javier; Martorell Bofill, Xavier; Hernando Pericás, Francisco Javier (2011)
      Conference presentation
      Access restricted by publisher policy
      Modern GPUs have evolved into fully programmable parallel stream multiprocessors. Due to the nature of the graphic workloads, computer vision algorithms are in good position to leverage the computing power of these ...
    • Reducing compiler-inserted instrumentation in unified-parallel-C code generation 

      Alvanos, Michail; Amaral, José Nelson; Tiotto, Ettore; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference proceedings paper
      Access restricted by publisher policy
      Programs written in Partitioned Global Address Space (PGAS) languages can access any location of the entire address space via standard read/write operations. However, the compiler has to create the communication mechanisms ...