Recent Submissions

  • CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference 

    Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2019-09-01)
    Article
    Open Access
    Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations ...
  • 2018 International Symposium on Computer Architecture influential paper award 

    González Colás, Antonio María (2018-07-01)
    Article
    Open Access
    The International Symposium on Computer Architecture (ISCA) recognizes every year the most influential paper published in this conference 15 years earlier, based on its impact on research, development, products or ideas. ...
  • Performance analysis and optimization of automatic speech recognition 

    Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (2018-10-01)
    Article
    Open Access
    Fast and accurate Automatic Speech Recognition (ASR) is emerging as a key application for mobile devices. Delivering ASR on such devices is challenging due to the compute-intensive nature of the problem and the power ...
  • SyRA: early system reliability analysis for cross-layer soft errors resilience in memory arrays of microprocessor systems 

    Vallero, Alessandro; Savino, Alessandro; Chatzidimitriou, Athanasios; Kaliorakis, Manolis; Kooli, Maha; Riera Villanueva, Marc; Di Natale, Giorgio; Bosio, Alberto; Canal Corretger, Ramon; Gizopoulos, Dimitris; Di Carlo, Stefano (Institute of Electrical and Electronics Engineers (IEEE), 2018-01-01)
    Article
    Open Access
    Cross-layer reliability is becoming the preferred solution when reliability is a concern in the design of a microprocessor-based system. Nevertheless, deciding how to distribute the error management across the different ...
  • Visibility rendering order: Improving energy efficiency on mobile GPUs through frame coherence 

    Lucas Casamayor, Enrique de; Marcuello Pascual, Pedro; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (2019-02-01)
    Article
    Open Access
    During real-time graphics rendering, objects are processed by the GPU in the order they are submitted by the CPU, and occluded surfaces are often processed even though they will end up not being part of the final image, ...
  • Strategies to enhance the 3T1D-DRAM cell variability robustness beyond 22 nm 

    Amat Bertran, Esteve; García Almudéver, Carmen; Aymerich, N.; Canal Corretger, Ramon; Rubio Sola, Jose Antonio (2014-10-01)
    Article
    Open Access
    The 3T1D cell has been proposed as a valid alternative for L1 cache memory, replacing the 6T cell, which is highly affected by device variability as technology dimensions are reduced. In this work, we have shown that 22 nm ...
  • Low-power automatic speech recognition through a mobile GPU and a Viterbi accelerator 

    Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (2017-04-12)
    Article
    Open Access
    Automatic speech recognition (ASR) has become a core technology for mobile devices. Delivering real-time and accurate ASR has a huge computational cost, which is challenging to achieve in tightly energy-constrained platforms ...
  • Statistical analysis and comparison of 2T and 3T1D e-DRAM minimum energy operation 

    Rana, Manish; Canal Corretger, Ramon; Amat Bertran, Esteve; Rubio Sola, Jose Antonio (2017-03-01)
    Article
    Open Access
    Biomedical wearable devices, restricted by their small-capacity embedded batteries, require energy efficiency of the highest order. However, the minimum-energy point (MEP) at sub-threshold voltages is unattainable with SRAM memory, ...
  • Executing algorithms with hypercube topology on torus multicomputers 

    González Colás, Antonio María; Valero García, Miguel; Díaz de Cerio Ripalda, Luis Manuel (1995-08)
    Article
    Open Access
    Many parallel algorithms use hypercubes as the communication topology among their processes. When such algorithms are executed on hypercube multicomputers, the communication cost is kept to a minimum since processes can be ...
  • Exploiting narrow values for soft error tolerance 

    Ergin, Oguz; Unsal, Osman Sabri; Vera Rivera, Francisco Javier; González Colás, Antonio María (2006-07)
    Article
    Open Access
    Soft errors are an important challenge in contemporary microprocessors. Particle hits on the components of a processor are expected to create an increasing number of transient errors with each new microprocessor generation. ...
  • Reliability: fallacy or reality? 

    González Colás, Antonio María; Mahlke, Scott; Mukherjee, Shubu; Sendag, Resit; Chiou, Derek; Yi, Joshua J. (2007-11)
    Article
    Open Access
    As chip architects and manufacturers plumb ever-smaller process technologies, new species of faults are compromising device reliability. Following an introduction, the authors debate whether reliability is a legitimate ...
  • Thread partitioning and value prediction for exploiting speculative thread-level parallelism 

    Marcuello, Pedro; González Colás, Antonio María; Tubella Murgadas, Jordi (2004-02)
    Article
    Open Access
    Speculative thread-level parallelism has been recently proposed as a source of parallelism to improve the performance in applications where parallel threads are hard to find. However, the efficiency of this execution model ...
