The group researches techniques to improve the efficiency of high-performance computing systems. It approaches this goal from several perspectives that require a degree of cooperation and integration: uniprocessor and multiprocessor architecture, compilers, operating systems, analysis, visualisation and prediction tools, algorithms and applications. Efficiency is measured with metrics that go beyond program execution time. In particular, the group considers system design factors (cycle time, area and power consumption of the processor and the memory hierarchy, scalability of the uniprocessor and multiprocessor organisation), functional verification of systems, ease of use and portability of the programming model, and performance in multiuser, multiprogrammed and distributed environments, among others.

http://futur.upc.edu/CAP

Recent submissions

  • Auto-tuned OpenCL kernel co-execution in OmpSs for heterogeneous systems 

    Pérez, Borja; Stafford, Esteban; Bosque Orero, José Luis; Beivide Palacio, Ramon; Mateo Bellido, Sergi; Teruel Garcia, Javier; Martorell Bofill, Xavier; Ayguadé Parra, Eduard (2019-03-01)
    Article
    Open access
    The emergence of heterogeneous systems has been very notable recently. The nodes of the most powerful computers integrate several compute accelerators, like GPUs. Profiting from such node configurations is not a trivial ...
  • Structural methods for the synthesis of speed-independent circuits 

    Pastor Llorens, Enric; Cortadella, Jordi; Kondratyev, Alex; Roig Mansilla, Oriol (1998-11)
    Article
    Open access
    Asynchronous circuits can be modeled as concurrent systems in which events are interpreted as signal transitions. The synthesis of concurrent systems implies the analysis of a vast state space that often requires computationally ...
  • Towards mobile cloud computing with single sign-on access 

    Lordan Gomis, Francesc-Josep; Jensen, J. K.; Badia Sala, Rosa Maria (2017-10-30)
    Article
    Open access
    The low computing power of mobile devices impedes the development of mobile applications with a heavy computing load. Mobile Cloud Computing (MCC) has emerged as the solution to this by connecting mobile devices with the ...
  • GekkoFS: A temporary distributed file system for HPC applications 

    Vef, Marc-André; Moti, Nafiseh; Süß, Tim; Tocci, Tommaso; Nou, Ramon; Miranda, Alberto; Cortés, Toni; Brinkmann, Andre (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference paper
    Open access
    We present GekkoFS, a temporary, highly-scalable burst buffer file system which has been specifically optimized for new access patterns of data-intensive High-Performance Computing (HPC) applications. The file system ...
  • Runtime-assisted cache coherence deactivation in task parallel programs 

    Caheny, Paul; Álvarez, Lluc; Valero Cortés, Mateo; Moreto Planas, Miquel; Casas, Marc (Association for Computing Machinery (ACM), 2018)
    Conference paper
    Open access
    With increasing core counts, the scalability of directory-based cache coherence has become a challenging problem. To reduce the area and power needs of the directory, recent proposals reduce its size by classifying data ...
  • Stencil codes on a vector length agnostic architecture 

    Armejach Sanosa, Adrià; Caminal Pallarés, Helena; Cebrián González, Juan Manuel; González-Alberquilla, Rekai; Adeniyi-Jones, Chris; Valero Cortés, Mateo; Casas, Marc; Moreto Planas, Miquel (Association for Computing Machinery (ACM), 2018)
    Conference paper
    Open access
    Data-level parallelism is frequently ignored or underutilized. Achieved through vector/SIMD capabilities, it can provide substantial performance improvements on top of widely used techniques such as thread-level parallelism. ...
  • Improving the interoperability between MPI and task-based programming models 

    Sala, Kevin; Bellón, Jorge; Farré, Pau; Teruel, Xavier; Pérez, Josep M.; Peña, Antonio J.; Holmes, Daniel; Beltran, Vicenç; Labarta Mancho, Jesús José (Association for Computing Machinery (ACM), 2018)
    Conference paper
    Open access
    In this paper we propose an API to pause and resume task execution depending on external events. We leverage this generic API to improve the interoperability between MPI synchronous communication primitives and tasks. When ...
  • Runtime-guided management of stacked DRAM memories in task parallel programs 

    Álvarez, Lluc; Casas, Marc; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Moreto Planas, Miquel (Association for Computing Machinery (ACM), 2018)
    Conference paper
    Open access
    Stacked DRAM memories have become a reality in High-Performance Computing (HPC) architectures. These memories provide much higher bandwidth while consuming less power than traditional off-chip memories, but their limited ...
  • Reducing data movement on large shared memory systems by exploiting computation dependencies 

    Barrera, I.S.; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Moreto Planas, Miquel; Labarta Mancho, Jesús José; Casas Guix, Marc (Association for Computing Machinery (ACM), 2018)
    Conference paper
    Restricted access due to publisher policy
    Shared memory systems are becoming increasingly complex as they typically integrate several storage devices. That brings different access latencies or bandwidth rates depending on the proximity between the cores where ...
  • Evaluation of A+B=K conditions without carry propagation 

    Cortadella, Jordi; Llaberia Griñó, José M. (Institute of Electrical and Electronics Engineers (IEEE), 1992-11)
    Article
    Open access
    The response time of parallel adders is mainly determined by the carry propagation delay. The evaluation of conditions of the type A+B=K is addressed. Although an addition is involved in the comparison, it is shown that ...
  • Variable batched DGEMM 

    Valero-Lara, Pedro; Martinez-Perez, Ivan; Mateo, Sergio; Sirvent Pardell, Raül; Beltran Querol, Vicenç; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference paper
    Restricted access due to publisher policy
    Many scientific applications are in need to solve a high number of small-size independent problems. These individual problems do not provide enough parallelism and then, these must be computed as a batch. Today, vendors ...
  • Performance characterization of Spark workloads on shared NUMA systems 

    Baig, Shuja Ur Rehman; Amaral, Marcelo; Polo Cantero, José; Carrera Pérez, David (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference paper
    Restricted access due to publisher policy
    As the adoption of Big Data technologies becomes the norm in an increasing number of scenarios, there is also a growing need to optimize them for modern processors. Spark has gained momentum over the last few years among ...
