Browse by author "Labarta Mancho, Jesús José"
Now showing items 137-156 of 217
-
Parallelizing dense and banded linear algebra libraries using SMPSs
Badia Sala, Rosa Maria; Herrero Zaragoza, José Ramón; Labarta Mancho, Jesús José; Pérez Cáncer, Josep Maria; Quintana Ortí, Enrique Salvador; Quintana Ortí, Gregorio (2009-12-25)
Article
Restricted access (publisher's policy)
The promise of future many-core processors, with hundreds of threads running concurrently, has led the developers of linear algebra libraries to rethink their design in order to extract more parallelism, further exploit ...
-
ParaView + Alya + D8tree: Integrating high performance computing and high performance data analytics
Artigues, Antoni; Cugnasco, Cesare; Becerra Fontal, Yolanda; Cucchietti, Fernando; Houzeaux, Guillaume; Vázquez, Mariano; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Elsevier, 2017)
Article
Open access
Large scale time-dependent particle simulations can generate massive amounts of data, making it so that storing the results is often the slowest phase and the primary time bottleneck of the simulation. Furthermore, analysing ...
-
PARSECSs: Evaluating the impact of task parallelism in the PARSEC benchmark suite
Chasapis, Dimitrios; Casas, Marc; Moretó Planas, Miquel; Vidal Ortiz, Raul; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2015-12-01)
Article
Open access
In this work, we show how parallel applications can be implemented efficiently using task parallelism. We also evaluate the benefits of such a parallel paradigm with respect to other approaches. We use the PARSEC benchmark ...
-
Performance analysis and optimization of the FFTXlib on the Intel knights landing architecture
Wagner, Michael; López, Victor; Morillo, Julian; Cavazzoni, Carlo; Affinito, Fabio; Gimenez, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2017)
Conference report
Open access
In this paper, we address the decreasing performance of the FFTXlib, the Fast Fourier Transformation (FFT) kernel of Quantum ESPRESSO, when scaling to a full KNL node. An increased performance in the FFTXlib will likewise ...
-
Performance analysis of parallel Python applications
Wagner, Michael; Llort Sánchez, Germán Matias; Mercadal Melia, Estanislao; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Elsevier, 2017)
Conference report
Open access
Python is progressively consolidating itself within the HPC community with its simple syntax, large standard library, and powerful third-party libraries for scientific computing that are especially attractive to domain ...
-
Performance data extrapolation in parallel codes
González García, Juan; Giménez Lucas, Judit; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2010)
Conference report
Restricted access (publisher's policy)
Measuring the performance of parallel codes is a compromise between many factors. The most important one is which data has to be analyzed. Current supercomputers are able to run applications on a large number of processors ...
-
Performance impact of the interconnection network on MareNostrum applications
Ramírez Bellido, Alejandro; Prat, Oriol; Labarta Mancho, Jesús José; Valero Cortés, Mateo (-, 2007)
Conference report
Open access
Interconnection networks are one of the fundamental components of a supercomputing facility, and one of the most expensive parts. They represent one of the main differences between two supercomputers built from the same ...
-
POSTER: collective dynamic parallelism for directive based GPU programming languages and compilers
Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2016)
Conference report
Restricted access (publisher's policy)
Early programs for GPU (Graphics Processing Units) acceleration were based on a flat, bulk parallel programming model, in which programs had to perform a sequence of kernel launches from the host CPU. In the latest releases ...
-
POSTER: Exploiting asymmetric multi-core processors with flexible system software
Chronaki, Kallia; Moretó Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2016)
Conference lecture
Open access
Energy efficiency has become the main challenge for high performance computing (HPC). The use of mobile asymmetric multi-core architectures to build future multi-core systems is an approach towards energy savings while ...
-
Power-aware load balancing of large scale MPI applications
Etinski, Maja; Corbalán González, Julita; Labarta Mancho, Jesús José; Valero Cortés, Mateo; Veidenbaum, Alex (Institute of Electrical and Electronics Engineers (IEEE), 2009)
Conference report
Open access
Power consumption is a very important issue for the HPC community, both at the level of a single application and at the level of a whole workload. Load imbalance of an MPI application can be exploited to save CPU energy without penalizing ...
-
Predicting MPI buffer addresses.
Freitag, Fèlix; Farreras Esclusa, Montserrat; Cortés, Toni; Labarta Mancho, Jesús José (Springer, 2004)
Conference lecture
Open access
Communication latencies have been identified as one of the performance limiting factors of message passing applications in clusters of workstations/multiprocessors. On the receiver side, message-copying operations contribute ...
-
Productive cluster programming with OmpSs
Bueno Hedo, Javier; Martinell Andreu, Luis; Duran González, Alejandro; Farreras Esclusa, Montserrat; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2011)
Conference report
Restricted access (publisher's policy)
Clusters of SMPs are ubiquitous. They have traditionally been programmed using MPI. However, the productivity of MPI programmers is low because of the complexity of expressing parallelism and communication, and the difficulty ...
-
Productive programming of GPU clusters with OmpSs
Bueno Hedo, Javier; Planas, Judit; Duran González, Alejandro; Badia Sala, Rosa Maria; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2012)
Conference lecture
Restricted access (publisher's policy)
Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of hybrid models that increase the complexity of the applications, reducing the productivity of programmers. We present ...
-
Programmable and scalable reductions on clusters
Ciesko, Jan; Bueno Hedo, Javier; Puzovic, Nikola; Ramírez Bellido, Alejandro; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2013)
Conference report
Restricted access (publisher's policy)
Reductions matter and they are here to stay. Wide adoption of parallel processing hardware in a broad range of computer applications has encouraged recent research efforts on their efficient parallelization. Furthermore, ...
-
Programmer-directed partial redundancy for resilient HPC
Subasi, Omer; Arias Moreno, Francisco Javier; Unsal, Osman Sabri; Labarta Mancho, Jesús José; Cristal Kestelman, Adrián (Association for Computing Machinery (ACM), 2015)
Conference report
Restricted access (publisher's policy)
In this work we propose partial task replication and check-pointing for task-parallel HPC applications to mitigate silent data corruption (SDC) errors. As the complete replication of all application tasks can be prohibitive ...
-
PyCOMPSs: Parallel computational workflows in Python
Tejedor, Enric; Becerra Fontal, Yolanda; Alomar, Guillem; Queralt Calafat, Anna; Badia Sala, Rosa Maria; Torres Viñals, Jordi; Cortés, Toni; Labarta Mancho, Jesús José (2017-01-01)
Article
Open access
The use of the Python programming language for scientific computing has been gaining momentum in recent years. The fact that it is compact and readable and its complete set of scientific libraries are two important ...
-
Random forest as a tumour genetic marker extractor
Pérez Arnal, Raquel Leandra; Garcia Gasulla, Dario; Torrents Rodas, David; Pares, Ferran; Cortés García, Claudio Ulises; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (IOS Press, 2019)
Conference report
Open access
Finding tumour genetic markers is essential to biomedicine due to their relevance for cancer detection and therapy development. In this paper, we explore a recently released dataset of chromosome rearrangements in 2,586 ...
-
Reducción de la degradación y el conflicto en las redes de interconexión para sistemas multiprocesadores
Llaberia Griñó, José M.; Labarta Mancho, Jesús José; Herrada Lillo, Enrique; Valero Cortés, Mateo (Asociación Española de Informática y Automática, 1985)
Conference report
Open access
One of the parameters that can cause a potential drop in the efficiency of a multiprocessor system is the response time of the memory subsystem. This work presents several techniques that improve ...
-
Reducing cache coherence traffic with hierarchical directory cache and NUMA-aware runtime scheduling
Caheny, Paul; Casas, Marc; Moretó Planas, Miquel; Gloaguen, Hervé; Saintes, Maxime; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2016)
Conference report
Open access
Cache Coherent NUMA (ccNUMA) architectures are a widespread paradigm due to the benefits they provide for scaling core count and memory capacity. Also, the flat memory address space they offer considerably improves ...
-
Reducing data movement on large shared memory systems by exploiting computation dependencies
Barrera, I.S.; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Moretó Planas, Miquel; Labarta Mancho, Jesús José; Casas, Marc (Association for Computing Machinery (ACM), 2018)
Conference report
Open access
Shared memory systems are becoming increasingly complex as they typically integrate several storage devices. That brings different access latencies or bandwidth rates depending on the proximity between the cores where ...