The group investigates techniques for improving the efficiency of high-performance computing systems. It approaches this goal from several perspectives that require a degree of cooperation and integration: uniprocessor and multiprocessor architecture, compilers, operating systems, tools for analysis, visualisation and prediction, algorithms, and applications. Efficiency is measured with metrics that go beyond program execution time. In particular, the group considers aspects of system design (cycle time, area and power consumption of the processor and memory hierarchy, scalability of the uniprocessor and multiprocessor organisation), functional verification of systems, portability and ease of use of the programming model, and performance in multiuser, multiprogrammed and distributed environments, among others.

Recent submissions

  • Data augmentation for deep learning of non-mydriatic screening retinal fundus images 

    Moya Sánchez, Eduardo Ulises; Sánchez Pérez, Abraham; Zapata Victori, Miguel Ángel; Moreno, Jonatan; Garcia Gasulla, Dario; Parés, Ferran; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises (Springer, 2018)
    Conference proceedings paper
    Restricted access due to publisher policy
    Fundus image is an effective and low-cost tool to screen for common retinal diseases. At the same time, Deep Learning (DL) algorithms have been shown capable of achieving similar or even better performance accuracies than ...
  • Towards resilient EU HPC systems: A blueprint 

    Radojkovic, Petar; Marazakis, Manolis; Carpenter, Paul Matthew; Jeyapaul, Reiley; Gizopoulos, Dimitris; Schulz, Martin; Armejach Sanosa, Adrià; Ayguadé Parra, Eduard; Canal Corretger, Ramon; Moreto Planas, Miquel; Salami, Behzad; Unsal, Osman Sabri (2020-04)
    Research report
    Open access
    This document aims to spearhead a Europe-wide discussion on HPC system resilience and to help the European HPC community define best practices for resilience. We analyse a wide range of state-of-the-art resilience mechanisms ...
  • An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration 

    Salami, Behzad; Onural, Erhan Baturay; Yuksel, Ismail Emir; Koc, Fahrettin; Ergin, Oguz; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Sarbazi-Azad, Hamid; Mutlu, Onur (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference proceedings paper
    Open access
    We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field ...
  • POSTER: SPiDRE: accelerating sparse memory access patterns 

    Barredo Ferreira, Adrián; Beard, Jonathan C.; Moreto Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference presentation
    Open access
    Development in process technology has led to an exponential increase in processor speed and memory capacity. However, memory latencies have not improved as dramatically and represent a well-known problem in computer ...
  • An architecture model for a distributed virtualization system 

    Pessolani, Pablo; Tinetti, Fernando; Cortés, Toni; Gonnet, Silvio (International Academy, Research, and Industry Association (IARIA), 2018)
    Conference proceedings paper
    Restricted access due to publisher policy
    This article presents an architecture model for a Distributed Virtualization System, which could expand a virtual execution environment from a single physical machine to several nodes of a cluster. With current virtualization ...
  • A toolchain to verify the parallelization of OmpSs-2 applications 

    Economo, Simone; Royuela Alcázar, Sara; Ayguadé Parra, Eduard; Beltran Querol, Vicenç (Springer, 2020)
    Conference proceedings paper
    Open access
    Programming models for task-based parallelization based on compile-time directives are very effective at uncovering the parallelism available in HPC applications. Despite that, the process of correctly annotating complex ...
  • A novel FPGA-based high throughput accelerator for binary search trees 

    Melikoglu, Oyku; Ergin, Oguz; Salami, Behzad; Pavón Rivera, Julián; Unsal, Osman Sabri; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference proceedings paper
    Open access
    This paper presents a deeply pipelined and massively parallel Binary Search Tree (BST) accelerator for Field Programmable Gate Arrays (FPGAs). Our design relies on the extremely parallel on-chip memory, or Block RAMs (BRAMs) ...
  • Towards an auto-tuned and task-based SpMV (LASs Library) 

    Catalán Pallarés, Sandra; Usui, Tetsuzo; Toledo, Leonel; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Valero Lara, Pedro (Springer, 2020)
    Conference proceedings paper
    Open access
    We present a novel approach to parallelize the SpMV kernel included in LASs (Linear Algebra routines on OmpSs) library, after a deep review and analysis of several well-known approaches. LASs is based on OmpSs, a task-based ...
  • An iris based lungs pre-diagnostic system 

    Hussain, Tassadaq; Haider, Amna; Muhammad, Abdul Malik; Agha, Areeb; Khan, Bilal; Rashid, Fawad; Raza, Muhammad Saad; Din, Moainud; Khan, Mehran; Ullah, Sami; Ahmed, Abdelmalik Taleb; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference proceedings paper
    Open access
    Human lungs are essential respiratory organs. Different Obstructive Lung Diseases (OLD) such as bronchitis, asthma, lungs cancer etc. affects the respiration. Diagnosing OLD in the initial stage is better than diagnosing ...
  • Exceeding conservative limits: A consolidated analysis on modern hardware margins 

    Papadimitriou, George; Chatzidimitriou, Athanansios; Gizopoulos, Dimitris; Reddi, Vijay Janapa; Leng, Jingwen; Salami, Behzad; Unsal, Osman Sabri; Cristal Kestelman, Adrián (2020-06)
    Article
    Open access
    Modern large-scale computing systems (data centers, supercomputers, cloud and edge setups and high-end cyber-physical systems) employ heterogeneous architectures that consist of multicore CPUs, general-purpose many-core ...
  • Fast gap-affine pairwise alignment using the wavefront algorithm 

    Marco-Sola, Santiago; Moure López, Juan Carlos; Moreto Planas, Miquel; Espinosa Morales, Antonio (2020-09-11)
    Article
    Open access
    Motivation: Pairwise alignment of sequences is a fundamental method in modern molecular biology, implemented within multiple bioinformatics tools and libraries. Current advances in sequencing technologies press for the ...
  • Asynchronous runtime with distributed manager for task-based programming models 

    Bosch Pons, Jaume; Álvarez Martínez, Carlos; Jiménez González, Daniel; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2020-09)
    Article
    Restricted access due to publisher policy
    Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of ...
