    • A low-power, high-performance speech recognition accelerator 

      Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019-12-01)
      Article
      Open Access
      Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at a high energy cost, which is not affordable for tightly power-budgeted mobile devices. ...
    • A novel register renaming technique for out-of-order processors 

      Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Restricted access - publisher's policy
      Modern superscalar processors support a large number of in-flight instructions, which requires sizeable register files. Conventional register renaming techniques allocate a new storage location, i.e. physical register, for ...
    • An ultra low-power hardware accelerator for acoustic scoring in speech recognition 

      Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Conference report
      Restricted access - publisher's policy
      Accurate, real-time Automatic Speech Recognition (ASR) comes at a high energy cost, so accuracy often has to be sacrificed in order to fit the strict power constraints of mobile systems. However, accuracy is extremely ...
    • An ultra low-power hardware accelerator for automatic speech recognition 

      Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (IEEE Press, 2016)
      Conference report
      Open Access
      Automatic Speech Recognition (ASR) is becoming increasingly ubiquitous, especially in the mobile segment. Fast and accurate ASR comes at a high energy cost which is not affordable for the tiny power budget of mobile devices. ...
    • CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2019-09-01)
      Article
      Open Access
      Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations ...
    • Computation reuse in DNNs by exploiting input similarity 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Restricted access - publisher's policy
      In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific ...
    • Demystifying power and performance bottlenecks in autonomous driving systems 

      Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      Autonomous Vehicles (AVs) have the potential to radically change the automotive industry. However, computing solutions for AVs have to meet severe performance and power constraints to guarantee a safe driving experience. ...
    • Design and evaluation of an ultra low-power human-quality speech recognition system 

      Pinto Rivero, Daniel; Arnau Montañés, José María; González Colás, Antonio María (2020-11)
      Article
      Open Access
      Automatic Speech Recognition (ASR) has experienced a dramatic evolution since the pioneering development of Bell Labs’ single-digit recognizer more than 50 years ago. Current ASR systems have taken advantage of the tremendous ...
    • E-PUR: an energy-efficient processing unit for recurrent neural networks 

      Silfa Feliz, Franyell Antonio; Dot, Gem; Arnau Montañés, José María; González Colás, Antonio María (2018)
      Conference report
      Restricted access - publisher's policy
      Recurrent Neural Networks (RNNs) are a key technology for emerging applications such as automatic speech recognition, machine translation or image description. Long Short Term Memory (LSTM) networks are the most successful ...
    • Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization 

      Arnau Montañés, José María; Parcerisa Bundó, Joan Manuel; Xekalakis, Polychronis (2014)
      Conference report
      Restricted access - publisher's policy
      Redundancy is at the heart of graphical applications. In fact, generating an animation typically involves the succession of extremely similar images. In terms of rendering these images, this behavior translates into the ...
    • Energy-efficient mobile GPU systems 

      Arnau Montañés, José María (Universitat Politècnica de Catalunya, 2015-04-24)
      Doctoral thesis
      Open Access
      The design of mobile GPUs is all about saving energy. Smartphones and tablets are battery-operated and thus any type of rendering needs to use as little energy as possible. Furthermore, smartphones do not include sophisticated ...
    • LAWS: Locality-AWare Scheme for automatic speech recognition 

      Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020-08-01)
      Article
      Open Access
      Automatic Speech Recognition (ASR) systems are changing the way people interact with different applications on mobile devices. Fulfilling such user-interactivity requires not only a highly accurate, large-vocabulary ...
    • Leveraging run-time feedback for efficient ASR acceleration 

      Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      In this work, we propose Locality-AWare-Scheme (LAWS) for an Automatic Speech Recognition (ASR) accelerator in order to significantly reduce its energy consumption and memory requirements, by leveraging the locality among ...
    • Low-power automatic speech recognition through a mobile GPU and a Viterbi accelerator 

      Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (2017-04-12)
      Article
      Open Access
      Automatic speech recognition (ASR) has become a core technology for mobile devices. Delivering real-time and accurate ASR has a huge computational cost, which is challenging to achieve in tightly energy-constrained platforms ...
    • Neuron-level fuzzy memoization in RNNs 

      Silfa Feliz, Franyell Antonio; Dot Artigas, Gem; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2019)
      Conference report
      Open Access
      Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition or machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the ...
    • Parallel frame rendering: trading responsiveness for energy on a mobile GPU 

      Arnau Montañés, José María; Parcerisa Bundó, Joan Manuel; Xekalakis, Polychronis (2013)
      Conference report
      Restricted access - publisher's policy
      Perhaps one of the most important design aspects for smartphones and tablets is improving their energy efficiency. Unfortunately, rich media content applications typically put significant pressure on the GPU's memory ...
    • Performance analysis and optimization of automatic speech recognition 

      Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (2018-10-01)
      Article
      Open Access
      Fast and accurate Automatic Speech Recognition (ASR) is emerging as a key application for mobile devices. Delivering ASR on such devices is challenging due to the compute-intensive nature of the problem and the power ...
    • SCU: a GPU stream compaction unit for graph processing 

      Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2019)
      Conference report
      Restricted access - publisher's policy
      Graph processing algorithms are key in many emerging applications in areas such as machine learning and data analytics. Although the processing of large scale graphs exhibits a high degree of parallelism, the memory access ...
    • TEAPOT: a toolset for evaluating performance, power and image quality on mobile graphics systems 

      Arnau Montañés, José María; Parcerisa Bundó, Joan Manuel; Xekalakis, Polychronis (ACM, 2013)
      Conference report
      Open Access
      In this paper we present TEAPOT, a full-system GPU simulator whose goal is to allow the evaluation of the GPUs that reside in mobile phones and tablets. To this end, it has a cycle-accurate GPU model for evaluating ...
    • The dark side of DNN pruning 

      Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María; Riera Villanueva, Marc (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Restricted access - publisher's policy
      DNN pruning has been recently proposed as an effective technique to improve the energy-efficiency of DNN-based solutions. It is claimed that by removing unimportant or redundant connections, the pruned DNN delivers higher ...