The group researches techniques for improving the efficiency of high-performance computing systems. It approaches this goal from several perspectives that require a degree of cooperation and integration: uniprocessor and multiprocessor system architecture, compilers, operating systems, analysis, visualisation and prediction tools, algorithms and applications. Efficiency is measured with metrics that go beyond program execution time. In particular, the group considers system design factors (cycle time, area and power consumption of the processor and memory hierarchy, scalability of the uniprocessor and multiprocessor organisation), the functional verification of systems, the ease of use and portability of the programming model, and performance in multiuser, multiprogrammed and distributed environments, among others.

Recent Submissions

  • The MAMe dataset: On the relevance of high resolution and variable shape image properties 

    Parés Pont, Ferran; Arias Duart, Anna; Garcia Gasulla, Dario; Campo Francés, Gema; Viladrich Iglesias, Nina; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Springer, 2022-08)
    Article
    Open Access
    The most common approach in image classification tasks is to resize all images in the dataset to a unique shape, while reducing their resolution to a size that makes experimentation at scale easier. This practice has benefits ...
  • Can we trust undervolting in FPGA-based deep learning designs at harsh conditions? 

    Koc, Fahrettin; Salami, Behzad; Ergin, Oguz; Unsal, Osman Sabri; Cristal Kestelman, Adrián (2022-05)
    Article
    Open Access
    As more Neural Networks on Field Programmable Gate Arrays (FPGAs) are used in a wider context, the importance of power efficiency increases. However, the focus on power should never compromise application accuracy. One ...
  • Data prefetching on in-order processors 

    Ortega Carrasco, Cristobal; García Flores, Víctor; Moretó Planas, Miquel; Casas, Marc; Rusitoru, Roxana (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Open Access
    Low-power processors have attracted attention due to their energy-efficiency. A large market, such as the mobile one, relies on these processors for this very reason. Even High Performance Computing (HPC) systems are ...
  • Transparent load balancing of MPI programs using OmpSs-2@Cluster and DLB 

    Aguilar Mena, Jimmy; Ali, Omar Shaaban Ibrahim; López Herrero, Víctor; Garcia Gasulla, Marta; Carpenter, Paul Matthew; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2022)
    Conference report
    Open Access
    Load imbalance is a long-standing source of inefficiency in high performance computing. The situation has only got worse as applications and systems increase in complexity, e.g., adaptive mesh refinement, DVFS, memory ...
  • Automatic aggregation of subtask accesses for nested OpenMP-style tasks 

    Ali, Omar Shaaban Ibrahim; Aguilar Mena, Jimmy; Beltran Querol, Vicenç; Carpenter, Paul Matthew; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open Access
    Task-based programming is a high performance and productive model to express parallelism. Tasks encapsulate work to be executed across multiple cores or offloaded to GPUs, FPGAs, other accelerators or other nodes. In order ...
  • An extension of the StarSs programming model for platforms with multiple GPUs 

    Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Igual Peña, Francisco D.; Labarta Mancho, Jesús José; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador (Springer, 2009)
    Conference lecture
    Open Access
    While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/power ratio can be attained using specialized ...
  • Space compression algorithms acceleration on embedded multi-core and GPU platforms 

    Jover Álvarez, Álvaro; Rodríguez Ferrández, Iván; Kosmidis, Leonidas; Steenari, David (Association for Computing Machinery (ACM), 2022)
    Conference lecture
    Open Access
    Future space missions will require increased on-board computing power to process and compress massive amounts of data. Consequently, embedded multi-core and GPU platforms are considered, which have been shown beneficial ...
  • Analyzing the performance of hierarchical collective algorithms on ARM-based multicore clusters 

    Utrera Iglesias, Gladys Miriam; Gil, Marisa; Martorell Bofill, Xavier (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference lecture
    Open Access
    MPI is the de facto communication standard library for parallel applications in distributed memory architectures. Collective operations performance is critical in HPC applications as they can become the bottleneck of their ...
  • Tuning dynamic web applications using fine-grain analysis 

    Guitart Fernández, Jordi; Carrera Pérez, David; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2005)
    Conference report
    Open Access
    In this paper we present a methodology to analyze the behavior and performance of Java application servers using a performance analysis framework. This framework considers all levels involved in the application server ...
  • Soporte para el análisis de workloads en el proyecto eNANOS 

    Rodero Castro, Iván; Corbalán González, Julita; Duran González, Alejandro; Labarta Mancho, Jesús José (2005)
    Conference report
    Open Access
    The eNANOS project proposes the coordinated scheduling of jobs across several levels, from the heterogeneous and dynamic environment of a Grid down to the execution of processes and threads on the CPUs of a computer or a cluster. ...
  • Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture 

    Ródenas Picó, David; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Almási, George; Cascaval, Calin; Castaños, José G.; Moreira, Jose E. (Institute of Electrical and Electronics Engineers (IEEE), 2005)
    Conference report
    Open Access
    In this paper, we present two approaches to improve the execution of OpenMP applications on the IBM Cyclops multithreaded architecture. Both solutions are independent and they are focused to obtain better performance through ...
  • WAS control center: an autonomic performance-triggered tracing environment for WebSphere 

    Carrera Pérez, David; García, David; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2005)
    Conference report
    Open Access
    Studying any aspect of an application server with high availability requirements can become a tedious task when a continuous monitoring of the server status is necessary. The creation of performance-driven autonomic systems ...