En aquest grup s´investiga en tècniques que permeten millorar l´eficiència dels sistemes de computació d?altes prestacions. Aquest objectiu es tracta des de perspectives diverses que requereixen un cert grau de cooperació: arquitectura del sistema uniprocessador i multiprocessador, compilador, sistema operatiu, eines d´anàlisi, visualització i predicció, algorismes i aplicacions. Per mesurar l´eficiència es consideren mètriques que van més enllà del temps d´execució dels programes. En particular es consideren aspectes relacionats amb el disseny del sistema (cicle d´operació, àrea i consum de potència del processador i la jerarquia de memòria, escalabilitat de l´organització uniprocessador i multiprocessador), amb la verificació funcional dels sistemes, amb la facilitat i la portabilitat del model de programació i amb el rendiment en entorns multiprogramats i distribuïts, entre altres.

The group aims to improve the efficiency of high-performance computing systems. To that end, it employs a variety of approaches that require a certain level of cooperation and integration: microarchitecture and multiprocessor architecture, compilers, operating systems, analysis, visualisation and prediction tools, algorithms and applications. When measuring efficiency, in addition to the traditional approach that takes the execution time into account, we use metrics that consider design factors such as cycle time, area and power dissipation of the processor and memory hierarchy, scalability of the microarchitecture and multiprocessor organisation, system correctness, portability and ease of use of programming models, and performance when running on multiuser, multiprogrammed and distributed environments, among others.

The group aims to improve the efficiency of high-performance computing systems. To that end, it employs a variety of approaches that require a certain level of cooperation and integration: microarchitecture and multiprocessor architecture, compilers, operating systems, analysis, visualisation and prediction tools, algorithms and applications. When measuring efficiency, in addition to the traditional approach that takes the execution time into account, we use metrics that consider design factors such as cycle time, area and power dissipation of the processor and memory hierarchy, scalability of the microarchitecture and multiprocessor organisation, system correctness, portability and ease of use of programming models, and performance when running on multiuser, multiprogrammed and distributed environments, among others.

Recent Submissions

  • MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain 

    Valero-Lara, Pedro; Sirvent, Raül; Peña, Antonio J.; Labarta Mancho, Jesús José (2019-05)
    Article
    Restricted access - publisher's policy
    The simulation of the behavior of the human brain is one of the most ambitious challenges today with a non-end of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
  • A hardware runtime for task-based programming models 

    Tan, Xubin; Bosch, Jaume; Álvarez, Carlos; Jiménez González, Daniel; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2019-09-01)
    Article
    Open Access
    Task-based programming models such as OpenMP 5.0 and OmpSs are simple to use and powerful enough to exploit task parallelism of applications over multicore, manycore and heterogeneous systems. However, their software-only ...
  • On the benefits of tasking with OpenMP 

    Rico, Alejandro; Sánchez Barrera, Isaac; Joao, Jose A.; Randall, Joshua; Casas, Marc; Moreto Planas, Miquel (Springer, 2019)
    Conference report
    Restricted access - publisher's policy
    Tasking promises a model to program parallel applications that provides intuitive semantics. In the case of tasks with dependences, it also promises better load balancing by removing global synchronizations (barriers), and ...
  • Distributed training of deep neural networks with spark: The MareNostrum experience 

    Cruz, Leonel; Tous Liesa, Rubén; Otero Calviño, Beatriz (Elsevier, 2019-07-01)
    Article
    Restricted access - publisher's policy
    Deployment of a distributed deep learning technology stack on a large parallel system is a very complex process, involving the integration and configuration of several layers of both, general-purpose and custom software. ...
  • Using Arm’s scalable vector extension on stencil codes 

    Armejach Sanosa, Adrià; Caminal Pallarés, Helena; Cebrián González, Juan Manuel; Langarita, Rubén; González-Alberquilla, Rekai; Adeniyi-Jones, Chris; Valero Cortés, Mateo; Casas Guix, Marc; Moreto Planas, Miquel (2019-04-08)
    Article
    Restricted access - publisher's policy
    Data-level parallelism is frequently ignored or underutilized. Achieved through vector/SIMD capabilities, it can provide substantial performance improvements on top of widely used techniques such as thread-level parallelism. ...
  • Accelerating hyperparameter optimisation with PyCOMPSs 

    Kahira, Albert Njoroge; Bautista Gomez, Leonardo Arturo; Conejero, Javier; Badia Sala, Rosa Maria (Association for Computing Machinery (ACM), 2019)
    Conference report
    Open Access
    Machine Learning applications now span across multiple domains due to the increase in computational power of modern systems. There has been a recent surge in Machine Learning applications in High Performance Computing (HPC) ...
  • Task Packing: Efficient task scheduling in unbalanced parallel programs to maximize CPU utilization 

    Utrera Iglesias, Gladys Miriam; Farreras Esclusa, Montse; Fornés de Juan, Jordi (Elsevier, 2019-12)
    Article
    Restricted access - publisher's policy
    Load imbalance in parallel systems can be generated by external factors to the currently running applications like operating system noise or the underlying hardware like a heterogeneous cluster. HPC applications working ...
  • Holistic slowdown driven scheduling and resource management for malleable jobs 

    D'Amico, Marco; Jokanovic, Ana; Corbalán González, Julita (Association for Computing Machinery (ACM), 2019)
    Conference report
    Open Access
    In job scheduling, the concept of malleability has been explored since many years ago. Research shows that malleability improves system performance, but its utilization in HPC never became widespread. The causes are the ...
  • The cooperative parallel: A discussion about run-time schedulers for nested parallelism 

    Royuela, Sara; Serrano, Maria A.; García Gasulla, Marta; Mateo Bellido, Sergi; Labarta Mancho, Jesús José; Quiñones Moreno, Eduardo (Springer, 2019)
    Conference report
    Open Access
    Nested parallelism is a well-known parallelization strategy to exploit irregular parallelism in HPC applications. This strategy also fits in critical real-time embedded systems, composed of a set of concurrent functionalities. ...
  • Artificial neural networks as emerging tools for earthquake detection 

    Rojas, Otilio; Otero Calviño, Beatriz; Alvarado, Leonardo; Mus, Sergi; Tous Liesa, Rubén (2019)
    Article
    Open Access
    As seismic networks continue to spread and monitoring sensors become more ef¿cient, the abundance of data highly surpasses the processing capabilities of earthquake interpretation analysts. Earthquake catalogs are fundamental ...
  • Assembling a high-productivity DSL for computational fluid dynamics 

    Macià, Sandra; Martínez-Ferrer, Pedro J.; Mateo, Sergi; Beltran Querol, Vicenç; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2019)
    Conference report
    Open Access
    As we move towards exascale computing, an abstraction for effective parallel computation is increasingly needed to overcome the maintainability and portability of scientific applications while ensuring the efficient and ...
  • Wav2Pix: speech-conditioned face generation using generative adversarial networks 

    Cardoso Duarte, Amanda; Roldan, Francisco; Tubau, Miquel; Escur, Janna; Pascual de la Puente, Santiago; Salvador Aguilera, Amaia; Mohedano, Eva; McGuinness, Kevin; Torres Viñals, Jordi; Giró Nieto, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference lecture
    Restricted access - publisher's policy
    Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker. In this work, we explore its potential to generate face images of a speaker by conditioning a ...

View more