Now showing items 137-156 of 217

    • Parallelizing dense and banded linear algebra libraries using SMPSs 

      Badia Sala, Rosa Maria; Herrero Zaragoza, José Ramón; Labarta Mancho, Jesús José; Pérez Cáncer, Josep Maria; Quintana Ortí, Enrique Salvador; Quintana Ortí, Gregorio (2009-12-25)
      Article
      Restricted access (publisher's policy)
      The promise of future many-core processors, with hundreds of threads running concurrently, has led the developers of linear algebra libraries to rethink their design in order to extract more parallelism, further exploit ...
    • ParaView + Alya + D8tree: Integrating high performance computing and high performance data analytics 

      Artigues, Antoni; Cugnasco, Cesare; Becerra Fontal, Yolanda; Cucchietti, Fernando; Houzeaux, Guillaume; Vázquez, Mariano; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Elsevier, 2017)
      Article
      Open access
      Large scale time-dependent particle simulations can generate massive amounts of data, making it so that storing the results is often the slowest phase and the primary time bottleneck of the simulation. Furthermore, analysing ...
    • PARSECSs: Evaluating the impact of task parallelism in the PARSEC benchmark suite 

      Chasapis, Dimitrios; Casas, Marc; Moretó Planas, Miquel; Vidal Ortiz, Raul; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2015-12-01)
      Article
      Open access
      In this work, we show how parallel applications can be implemented efficiently using task parallelism. We also evaluate the benefits of such parallel paradigm with respect to other approaches. We use the PARSEC benchmark ...
    • Performance analysis and optimization of the FFTXlib on the Intel knights landing architecture 

      Wagner, Michael; López, Victor; Morillo, Julian; Cavazzoni, Carlo; Affinito, Fabio; Gimenez, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Conference report
      Open access
      In this paper, we address the decreasing performance of the FFTXlib, the Fast Fourier Transformation (FFT) kernel of Quantum ESPRESSO, when scaling to a full KNL node. An increased performance in the FFTXlib will likewise ...
    • Performance analysis of parallel Python applications 

      Wagner, Michael; Llort Sánchez, Germán Matias; Mercadal Melia, Estanislao; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Elsevier, 2017)
      Conference report
      Open access
      Python is progressively consolidating itself within the HPC community with its simple syntax, large standard library, and powerful third-party libraries for scientific computing that are especially attractive to domain ...
    • Performance data extrapolation in parallel codes 

      González García, Juan; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2010)
      Conference report
      Restricted access (publisher's policy)
      Measuring the performance of parallel codes is a compromise between lots of factors. The most important one is which data has to be analyzed. Current supercomputers are able to run applications in large number of processors ...
    • Performance impact of the interconnection network on MareNostrum applications 

      Ramírez Bellido, Alejandro; Prat, Oriol; Labarta Mancho, Jesús José; Valero Cortés, Mateo (-, 2007)
      Conference report
      Open access
      Interconnection networks are one of the fundamental components of a supercomputing facility, and one of the most expensive parts. They represent one of the main differences between two supercomputers built from the same ...
    • POSTER: collective dynamic parallelism for directive based GPU programming languages and compilers 

      Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2016)
      Conference report
      Restricted access (publisher's policy)
      Early programs for GPU (Graphics Processing Units) acceleration were based on a flat, bulk parallel programming model, in which programs had to perform a sequence of kernel launches from the host CPU. In the latest releases ...
    • POSTER: Exploiting asymmetric multi-core processors with flexible system software 

      Chronaki, Kallia; Moretó Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2016)
      Conference lecture
      Open access
      Energy efficiency has become the main challenge for high performance computing (HPC). The use of mobile asymmetric multi-core architectures to build future multi-core systems is an approach towards energy savings while ...
    • Power-aware load balancing of large scale MPI applications 

      Etinski, Maja; Corbalán González, Julita; Labarta Mancho, Jesús José; Valero Cortés, Mateo; Veidenbaum, Alex (Institute of Electrical and Electronics Engineers (IEEE), 2009)
      Conference report
      Open access
      Power consumption is a very important issue for the HPC community, both at the level of a single application and at the level of the whole workload. Load imbalance of an MPI application can be exploited to save CPU energy without penalizing ...
    • Predicting MPI buffer addresses 

      Freitag, Fèlix; Farreras Esclusa, Montserrat; Cortés, Toni; Labarta Mancho, Jesús José (Springer, 2004)
      Conference lecture
      Open access
      Communication latencies have been identified as one of the performance limiting factors of message passing applications in clusters of workstations/multiprocessors. On the receiver side, message-copying operations contribute ...
    • Productive cluster programming with OmpSs 

      Bueno Hedo, Javier; Martinell Andreu, Luis; Duran González, Alejandro; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2011)
      Conference report
      Restricted access (publisher's policy)
      Clusters of SMPs are ubiquitous. They have been traditionally programmed by using MPI. But, the productivity of MPI programmers is low because of the complexity of expressing parallelism and communication, and the difficulty ...
    • Productive programming of GPU clusters with OmpSs 

      Bueno Hedo, Javier; Planas, Judit; Duran González, Alejandro; Badia Sala, Rosa Maria; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2012)
      Conference lecture
      Restricted access (publisher's policy)
      Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of hybrid models that increase the complexity of the applications, reducing the productivity of programmers. We present ...
    • Programmable and scalable reductions on clusters 

      Ciesko, Jan; Bueno Hedo, Javier; Puzovic, Nikola; Ramírez Bellido, Alejandro; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2013)
      Conference report
      Restricted access (publisher's policy)
      Reductions matter and they are here to stay. Wide adoption of parallel processing hardware in a broad range of computer applications has encouraged recent research efforts on their efficient parallelization. Furthermore, ...
    • Programmer-directed partial redundancy for resilient HPC 

      Subasi, Omer; Arias Moreno, Francisco Javier; Unsal, Osman Sabri; Labarta Mancho, Jesús José; Cristal Kestelman, Adrián (Association for Computing Machinery (ACM), 2015)
      Conference report
      Restricted access (publisher's policy)
      In this work we propose partial task replication and check-pointing for task-parallel HPC applications to mitigate silent data corruption (SDC) errors. As the complete replication of all application tasks can be prohibitive ...
    • PyCOMPSs: Parallel computational workflows in Python 

      Tejedor, Enric; Becerra Fontal, Yolanda; Alomar, Guillem; Queralt Calafat, Anna; Badia Sala, Rosa Maria; Torres Viñals, Jordi; Cortés, Toni; Labarta Mancho, Jesús José (2017-01-01)
      Article
      Open access
      The use of the Python programming language for scientific computing has been gaining momentum in the last years. The fact that it is compact and readable and its complete set of scientific libraries are two important ...
    • Random forest as a tumour genetic marker extractor 

      Pérez Arnal, Raquel Leandra; Garcia Gasulla, Dario; Torrents Rodas, David; Pares, Ferran; Cortés García, Claudio Ulises; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (IOS Press, 2019)
      Conference report
      Open access
      Finding tumour genetic markers is essential to biomedicine due to their relevance for cancer detection and therapy development. In this paper, we explore a recently released dataset of chromosome rearrangements in 2,586 ...
    • Reducción de la degradación y el conflicto en las redes de interconexión para sistemas multiprocesadores 

      Llaberia Griñó, José M.; Labarta Mancho, Jesús José; Herrada Lillo, Enrique; Valero Cortés, Mateo (Asociación Española de Informática y Automática, 1985)
      Conference report
      Open access
      One of the parameters that can cause a potential drop in the efficiency of a multiprocessor system is the response time of the memory subsystem. This work presents several techniques that improve ...
    • Reducing cache coherence traffic with hierarchical directory cache and NUMA-aware runtime scheduling 

      Caheny, Paul; Casas, Marc; Moretó Planas, Miquel; Gloaguen, Hervé; Saintes, Maxime; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Open access
      Cache Coherent NUMA (ccNUMA) architectures are a widespread paradigm due to the benefits they provide for scaling core count and memory capacity. Also, the flat memory address space they offer considerably improves ...
    • Reducing data movement on large shared memory systems by exploiting computation dependencies 

      Barrera, I.S.; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Moretó Planas, Miquel; Labarta Mancho, Jesús José; Casas, Marc (Association for Computing Machinery (ACM), 2018)
      Conference report
      Open access
      Shared memory systems are becoming increasingly complex as they typically integrate several storage devices. That brings different access latencies or bandwidth rates depending on the proximity between the cores where ...