    • MetH: A family of high-resolution and variable-shape image challenges 

      Parés Pont, Ferran; Garcia Gasulla, Dario; Servat, Harald; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (2019-11-20)
      External research report
      Open Access
      High-resolution and variable-shape images have not yet been properly addressed by the AI community. The approach of down-sampling data often used with convolutional neural networks is sub-optimal for many tasks, and has ...
    • Practical experience with Nebelung: the runtime support for transactional memory and OpenMP 

      Milovanovic, Milos; Ferrer, Roger; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2007-08)
      External research report
      Open Access
      Transactional Memory (TM) is a key future technology for emerging many-cores. On the other hand, OpenMP provides a vast established base for writing parallel programs, especially for scientific applications. Combining TM ...
    • The MS-processor's register file timing and power evaluation 

      Gonzalez Martin, Isidro; Cristal Kestelman, Adrián; Veindenbaum, Alex; Ramírez, Marco Antonio; Valero Cortés, Mateo (2008-09)
      External research report
      Open Access
      Power evaluation is an important issue in newly proposed chip-level architectures, because a large amount of power is dissipated as heat and chips have limited heat dissipation capacity. The evaluation shown in this technical ...
    • Systolic implementation for deconvolution iterative algorithm 

      Navarro Guerrero, Juan José; Casares Giner, Vicente (1985)
      External research report
      Open Access
      Systolic architectures implement regular algorithms in hardware, in order to obtain high computational throughput. In this paper we provide a modular architecture for a deconvolution iterative algorithm. The basic module ...
    • The Mont-Blanc prototype: an alternative approach for high-performance computing systems 

      Rajovic, Nikola; Ramírez Bellido, Alejandro; Rico, Alejandro; Mantovani, Filippo; Ruiz, Daniel; Villarubi, Oriol; Gómez, Constantino; Backes, Luna; Nieto, Diego; Servat, Harald; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Adeniyi-Jones, Chris; Derradji, Said; Gloaguen, Hervé; Lanucara, Piero; Sanna, Nico; Mehaut, Jean-François; Pouget, Kevin; Videau, Brice; Boyer, Eric; Allalen, Momme; Auweter, Axel; Brayford, David; Tafani, Daniele; Brömmel, Dirk; Halver, René; Meinke, Jan H.; Beivide Palacio, Ramon; Benito, Mariano; Vallejo, Enrique (2016)
      External research report
      Open Access
      High-performance computing (HPC) is recognized as one of the pillars for further advance of science, industry, medicine, and education. Current HPC systems are being developed to overcome emerging challenges in order to ...
    • Exploiting asynchrony from exact forward recovery for DUE in iterative solvers 

      Jaulmes, Luc Etienne; Casas Guix, Marc; Moreto Planas, Miquel; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2015)
      External research report
      Open Access
      This paper presents a method to protect iterative solvers from Detected and Uncorrected Errors (DUE) relying on error detection techniques already available in commodity hardware. Detection operates at the memory page ...
    • Commit on overflow 

      Stipic, Srdjan; Armejach, Adrià; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2014)
      External research report
      Open Access
      Current commercial CPUs have hardware support for speculative lock elision (SLE). SLE tries to elide the lock by speculatively executing the lock-protected critical section. If the speculation fails, SLE acquires the lock and ...
    • Per-task energy accounting in computing systems 

      Liu, Qixiao; Jiménez, Víctor; Moreto Planas, Miquel; Abella Ferrer, Jaume; Cazorla, Francisco; Valero Cortés, Mateo (2013)
      External research report
      Open Access
      We present for the first time the concept of per-task energy accounting (PTEA) and relate it to per-task energy metering (PTEM). We show the benefits of supporting both in future computing systems. Using the shared last-level ...
    • CUsched: multiprogrammed workload scheduling on GPU architectures 

      Tanasic, Ivan; Gelado Fernandez, Isaac; Cabezas, Javier; Navarro, Nacho; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2013)
      External research report
      Open Access
      Graphics Processing Units (GPUs) are currently widely used in High Performance Computing (HPC) applications to speed up the execution of massively parallel codes. GPUs are well-suited for such HPC environments because ...
    • Solving matrix problems with no size restriction on a systolic array processor 

      Navarro Guerrero, Juan José; Llaberia Griñó, José M.; Valero Cortés, Mateo (1986)
      External research report
      Open Access
      In this paper we propose several data-structure partitioning and transformation schemes, in order to get an efficient execution of various matrix algorithms without any size restriction. The following matrix operations are ...
    • Computing size-independent matrix problems on systolic array processors 

      Navarro Guerrero, Juan José; Llaberia Griñó, José M.; Valero Cortés, Mateo (1985)
      External research report
      Open Access
      A methodology to transform dense to banded matrices is presented in this paper. This transformation is accomplished by triangular block partitioning, and allows the implementation of solutions to problems with any given ...
    • Keeping control transfer instructions out of the pipeline in architectures without condition codes 

      Cortadella, Jordi; Llaberia Griñó, José M.; González Colás, Antonio María (1987-05)
      External research report
      Open Access
      The execution of branch instructions involves a loss of performance in pipelined processors. In this paper we present a mechanism for executing this kind of instruction with a zero delay. This mechanism has been proposed ...