  • A low-power, high-performance speech recognition accelerator 

    Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019-12-01)
    Article
    Open Access
    Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at high energy cost, not being affordable for the tiny power-budgeted mobile devices. ...
  • CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference 

    Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2019-09-01)
    Article
    Open Access
    Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations ...
  • Rendering elimination: early discard of redundant tiles in the graphics pipeline 

    Anglada Sánchez, Martí; de Lucas Casamayor, Enrique; Parcerisa Bundó, Joan Manuel; Aragón, Juan Luis; Marcuello Pascual, Pedro; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference report
    Restricted access - publisher's policy
    GPUs are one of the most energy-consuming components for real-time rendering applications, since a large number of fragment shading computations and memory accesses are involved. Main memory bandwidth is especially taxing ...
  • Early visibility resolution for removing ineffectual computations in the graphics pipeline 

    Anglada Sánchez, Martí; de Lucas Casamayor, Enrique; Parcerisa Bundó, Joan Manuel; Aragón Alcaraz, Juan Luis; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference report
    Restricted access - publisher's policy
    GPUs' main workload is real-time image rendering. These applications take a description of a (animated) scene and produce the corresponding image(s). An image is rendered by computing the colors of all its pixels. It is ...
  • 2018 International Symposium on Computer Architecture influential paper award 

    González Colás, Antonio María (2018-07-01)
    Article
    Open Access
    The International Symposium on Computer Architecture (ISCA) recognizes every year the most influential paper published in this conference 15 years earlier, based on its impact on research, development, products or ideas. ...
  • Performance analysis and optimization of automatic speech recognition 

    Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (2018-10-01)
    Article
    Open Access
    Fast and accurate Automatic Speech Recognition (ASR) is emerging as a key application for mobile devices. Delivering ASR on such devices is challenging due to the compute-intensive nature of the problem and the power ...
  • E-PUR: an energy-efficient processing unit for recurrent neural networks 

    Silfa Feliz, Franyell Antonio; Dot, Gem; Arnau Montañés, José María; González Colás, Antonio María (2018)
    Conference report
    Restricted access - publisher's policy
    Recurrent Neural Networks (RNNs) are a key technology for emerging applications such as automatic speech recognition, machine translation or image description. Long Short Term Memory (LSTM) networks are the most successful ...
  • SyRA: early system reliability analysis for cross-layer soft errors resilience in memory arrays of microprocessor systems 

    Vallero, Alessandro; Savino, Alessandro; Chatzidimitriou, Athanasios; Kaliorakis, Manolis; Kooli, Maha; Riera Villanueva, Marc; Di Natale, Giorgio; Bosio, Alberto; Canal Corretger, Ramon; Gizopoulos, Dimitris; Di Carlo, Stefano (Institute of Electrical and Electronics Engineers (IEEE), 2018-01-01)
    Article
    Open Access
    Cross-layer reliability is becoming the preferred solution when reliability is a concern in the design of a microprocessor-based system. Nevertheless, deciding how to distribute the error management across the different ...
  • Computation reuse in DNNs by exploiting input similarity 

    Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Restricted access - publisher's policy
    In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific ...
  • The dark side of DNN pruning 

    Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María; Riera Villanueva, Marc (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Restricted access - publisher's policy
    DNN pruning has been recently proposed as an effective technique to improve the energy-efficiency of DNN-based solutions. It is claimed that by removing unimportant or redundant connections, the pruned DNN delivers higher ...
  • Visibility rendering order: Improving energy efficiency on mobile GPUs through frame coherence 

    de Lucas Casamayor, Enrique; Marcuello Pascual, Pedro; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (2019-02-01)
    Article
    Open Access
    During real-time graphics rendering, objects are processed by the GPU in the order they are submitted by the CPU, and occluded surfaces are often processed even though they will end up not being part of the final image, ...
  • A novel register renaming technique for out-of-order processors 

    Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Restricted access - publisher's policy
    Modern superscalar processors support a large number of in-flight instructions, which requires sizeable register files. Conventional register renaming techniques allocate a new storage location, i.e. physical register, for ...