Browse by author "Labarta Mancho, Jesús José"

Parallelizing dense and banded linear algebra libraries using SMPSs

Badia Sala, Rosa Maria; Herrero Zaragoza, José Ramón; Labarta Mancho, Jesús José; Pérez Cáncer, Josep Maria; Quintana Ortí, Enrique Salvador; Quintana Ortí, Gregorio (2009-12-25)
Article
Restricted access due to publisher's policy

The promise of future many-core processors, with hundreds of threads running concurrently, has led the developers of linear algebra libraries to rethink their design in order to extract more parallelism, further exploit ...

ParaView + Alya + D8tree: Integrating high performance computing and high performance data analytics

Artigues, Antoni; Cugnasco, Cesare; Becerra Fontal, Yolanda; Cucchietti, Fernando; Houzeaux, Guillaume; Vázquez, Mariano; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Elsevier, 2017)
Article
Open access

Large scale time-dependent particle simulations can generate massive amounts of data, making it so that storing the results is often the slowest phase and the primary time bottleneck of the simulation. Furthermore, analysing ...

PARSECSs: Evaluating the impact of task parallelism in the PARSEC benchmark suite

Chasapis, Dimitrios; Casas, Marc; Moretó Planas, Miquel; Vidal Ortiz, Raul; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2015-12-01)
Article
Open access

In this work, we show how parallel applications can be implemented efficiently using task parallelism. We also evaluate the benefits of such parallel paradigm with respect to other approaches. We use the PARSEC benchmark ...

Performance analysis and optimization of the FFTXlib on the Intel knights landing architecture

Wagner, Michael; López, Victor; Morillo, Julian; Cavazzoni, Carlo; Affinito, Fabio; Gimenez, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2017)
Conference report
Open access

In this paper, we address the decreasing performance of the FFTXlib, the Fast Fourier Transformation (FFT) kernel of Quantum ESPRESSO, when scaling to a full KNL node. An increased performance in the FFTXlib will likewise ...

Performance analysis of parallel Python applications

Wagner, Michael; Llort Sánchez, Germán Matias; Mercadal Melia, Estanislao; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Elsevier, 2017)
Conference report
Open access

Python is progressively consolidating itself within the HPC community with its simple syntax, large standard library, and powerful third-party libraries for scientific computing that are especially attractive to domain ...

Performance data extrapolation in parallel codes

González García, Juan; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2010)
Conference report
Restricted access due to publisher's policy

Measuring the performance of parallel codes is a compromise between lots of factors. The most important one is which data has to be analyzed. Current supercomputers are able to run applications in large number of processors ...

Performance impact of the interconnection network on MareNostrum applications

Ramírez Bellido, Alejandro; Prat, Oriol; Labarta Mancho, Jesús José; Valero Cortés, Mateo (-, 2007)
Conference report
Open access

Interconnection networks are one of the fundamental components of a supercomputing facility, and one of the most expensive parts. They represent one of the main differences between two supercomputers built from the same ...

POSTER: collective dynamic parallelism for directive based GPU programming languages and compilers

Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2016)
Conference report
Restricted access due to publisher's policy

Early programs for GPU (Graphics Processing Units) acceleration were based on a flat, bulk parallel programming model, in which programs had to perform a sequence of kernel launches from the host CPU. In the latest releases ...

POSTER: Exploiting asymmetric multi-core processors with flexible system software

Chronaki, Kallia; Moretó Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2016)
Conference lecture
Open access

Energy efficiency has become the main challenge for high performance computing (HPC). The use of mobile asymmetric multi-core architectures to build future multi-core systems is an approach towards energy savings while ...

Power-aware load balancing of large scale MPI applications

Etinski, Maja; Corbalán González, Julita; Labarta Mancho, Jesús José; Valero Cortés, Mateo; Veidenbaum, Alex (Institute of Electrical and Electronics Engineers (IEEE), 2009)
Conference report
Open access

Power consumption is a very important issue for HPC community, both at the level of one application or at the level of whole workload. Load imbalance of a MPI application can be exploited to save CPU energy without penalizing ...

Predicting MPI buffer addresses

Freitag, Fèlix; Farreras Esclusa, Montserrat; Cortés, Toni; Labarta Mancho, Jesús José (Springer, 2004)
Conference lecture
Open access

Communication latencies have been identified as one of the performance limiting factors of message passing applications in clusters of workstations/multiprocessors. On the receiver side, message-copying operations contribute ...

Productive cluster programming with OmpSs

Bueno Hedo, Javier; Martinell Andreu, Luis; Duran González, Alejandro; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2011)
Conference report
Restricted access due to publisher's policy

Clusters of SMPs are ubiquitous. They have been traditionally programmed by using MPI. But, the productivity of MPI programmers is low because of the complexity of expressing parallelism and communication, and the difficulty ...

Productive programming of GPU clusters with OmpSs

Bueno Hedo, Javier; Planas, Judit; Duran González, Alejandro; Badia Sala, Rosa Maria; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2012)
Conference lecture
Restricted access due to publisher's policy

Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of hybrid models that increase the complexity of the applications, reducing the productivity of programmers. We present ...

Programmable and scalable reductions on clusters

Ciesko, Jan; Bueno Hedo, Javier; Puzovic, Nikola; Ramírez Bellido, Alejandro; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2013)
Conference report
Restricted access due to publisher's policy

Reductions matter and they are here to stay. Wide adoption of parallel processing hardware in a broad range of computer applications has encouraged recent research efforts on their efficient parallelization. Furthermore, ...

Programmer-directed partial redundancy for resilient HPC

Subasi, Omer; Arias Moreno, Francisco Javier; Unsal, Osman Sabri; Labarta Mancho, Jesús José; Cristal Kestelman, Adrián (Association for Computing Machinery (ACM), 2015)
Conference report
Restricted access due to publisher's policy

In this work we propose partial task replication and check-pointing for task-parallel HPC applications to mitigate silent data corruption (SDC) errors. As the complete replication of all application tasks can be prohibitive ...

PyCOMPSs: Parallel computational workflows in Python

Tejedor, Enric; Becerra Fontal, Yolanda; Alomar, Guillem; Queralt Calafat, Anna; Badia Sala, Rosa Maria; Torres Viñals, Jordi; Cortés, Toni; Labarta Mancho, Jesús José (2017-01-01)
Article
Open access

The use of the Python programming language for scientific computing has been gaining momentum in the last years. The fact that it is compact and readable and its complete set of scientific libraries are two important ...

Random forest as a tumour genetic marker extractor

Pérez Arnal, Raquel Leandra; Garcia Gasulla, Dario; Torrents Rodas, David; Pares, Ferran; Cortés García, Claudio Ulises; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (IOS Press, 2019)
Conference report
Open access

Finding tumour genetic markers is essential to biomedicine due to their relevance for cancer detection and therapy development. In this paper, we explore a recently released dataset of chromosome rearrangements in 2,586 ...

Reducción de la degradación y el conflicto en las redes de interconexión para sistemas multiprocesadores

Llaberia Griñó, José M.; Labarta Mancho, Jesús José; Herrada Lillo, Enrique; Valero Cortés, Mateo (Asociación Española de Informática y Automática, 1985)
Conference report
Open access

One of the parameters that can reduce the efficiency of a multiprocessor system is the response time of the memory subsystem. This paper presents several techniques that improve ...

Reducing cache coherence traffic with hierarchical directory cache and NUMA-aware runtime scheduling

Caheny, Paul; Casas, Marc; Moretó Planas, Miquel; Gloaguen, Hervé; Saintes, Maxime; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2016)
Conference report
Open access

Cache Coherent NUMA (ccNUMA) architectures are a widespread paradigm due to the benefits they provide for scaling core count and memory capacity. Also, the flat memory address space they offer considerably improves ...

Reducing data movement on large shared memory systems by exploiting computation dependencies

Barrera, I.S.; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Moretó Planas, Miquel; Labarta Mancho, Jesús José; Casas, Marc (Association for Computing Machinery (ACM), 2018)
Conference report
Open access

Shared memory systems are becoming increasingly complex as they typically integrate several storage devices. That brings different access latencies or bandwidth rates depending on the proximity between the cores where ...