Now showing items 21-40 of 40

    • Exploiting kernel compression on BNNs 

      Silfa Feliz, Franyell Antonio; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
      Conference paper
      Open access
      Binary Neural Networks (BNNs) are showing tremendous success on realistic image classification tasks. Notably, their accuracy is similar to the state-of-the-art accuracy obtained by full-precision models tailored to edge ...
    • Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads 

      Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (2023-01)
      Article
      Open access
      GPGPU architectures have become the dominant platform for massively parallel workloads, delivering high performance and energy efficiency for popular applications such as machine learning, computer vision or self-driving ...
    • K-D Bonsai: ISA-extensions to compress K-D trees for autonomous driving tasks 

      Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
      Conference paper
      Open access
      Autonomous Driving (AD) systems extensively manipulate 3D point clouds for object detection and vehicle localization. Thereby, efficient processing of 3D point clouds is crucial in these systems. In this work we propose ...
    • LAWS: Locality-AWare Scheme for automatic speech recognition 

      Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020-08-01)
      Article
      Open access
      Automatic Speech Recognition (ASR) systems are changing the way people interact with different applications on mobile devices. Fulfilling such user-interactivity requires not only a highly accurate, large-vocabulary ...
    • Leveraging run-time feedback for efficient ASR acceleration 

      Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference paper
      Open access
      In this work, we propose Locality-AWare-Scheme (LAWS) for an Automatic Speech Recognition (ASR) accelerator in order to significantly reduce its energy consumption and memory requirements, by leveraging the locality among ...
    • Lightweight register file caching in collector units for GPUs 

      Abaie Shoushtary, Mojtaba; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
      Conference paper
      Open access
      Modern GPUs benefit from a sizable Register File (RF) to provide fine-grained thread switching. As the RF is huge and accessed frequently, it consumes a considerable share of the dynamic energy of the GPU. Designing a ...
    • LOCATOR: Low-power ORB accelerator for autonomous cars 

      Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Elsevier, 2023-04)
      Article
      Open access
      Simultaneous Localization And Mapping (SLAM) is crucial for autonomous navigation. ORB-SLAM is a state-of-the-art Visual SLAM system based on cameras used for self-driving cars. In this paper, we propose a high-performance, ...
    • Low-power automatic speech recognition through a mobile GPU and a Viterbi accelerator 

      Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (2017-04-12)
      Article
      Open access
      Automatic speech recognition (ASR) has become a core technology for mobile devices. Delivering real-time and accurate ASR has a huge computational cost, which is challenging to achieve in tightly energy-constrained platforms ...
    • Neuron-level fuzzy memoization in RNNs 

      Silfa Feliz, Franyell Antonio; Dot Artigas, Gem; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2019)
      Conference paper
      Open access
      Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition or machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the ...
    • Parallel frame rendering: trading responsiveness for energy on a mobile GPU 

      Arnau Montañés, José María; Parcerisa Bundó, Joan Manuel; Xekalakis, Polychronis (2013)
      Conference paper
      Restricted access (publisher's policy)
      Perhaps one of the most important design aspects for smartphones and tablets is improving their energy efficiency. Unfortunately, rich media content applications typically put significant pressure to the GPU's memory ...
    • Performance analysis and optimization of automatic speech recognition 

      Tabani, Hamid; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (2018-10-01)
      Article
      Open access
      Fast and accurate Automatic Speech Recognition (ASR) is emerging as a key application for mobile devices. Delivering ASR on such devices is challenging due to the compute-intensive nature of the problem and the power ...
    • SCU: a GPU stream compaction unit for graph processing 

      Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2019)
      Conference paper
      Restricted access (publisher's policy)
      Graph processing algorithms are key in many emerging applications in areas such as machine learning and data analytics. Although the processing of large scale graphs exhibits a high degree of parallelism, the memory access ...
    • SHARP: An adaptable, energy-efficient accelerator for recurrent neural networks 

      Yazdani Aminabadi, Reza; Ruwase, Olatunji; Zhang, Minjia; He, Yuxiong; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023-01-24)
      Article
      Open access
      The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, ...
    • Simple out of order core for GPGPUs 

      Huerta Gañán, Rodrigo; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
      Conference paper
      Open access
      GPU architectures have become popular for executing general-purpose programs which rely on having a large number of threads that run concurrently to hide the latency among dependent instructions. This approach has an ...
    • SLIDEX: Sliding window extension for image processing 

      Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
      Conference paper
      Open access
      With the rising need for efficient image processing in emerging applications such as Autonomous Driving (AD) and Augmented/Virtual Reality (AR/VR), many existing solutions do not meet their performance and energy efficiency ...
    • Sliding window support for image processing in autonomous vehicles 

      Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (2022)
      Conference paper
      Open access
      Camera-based autonomous driving extensively manipulates images for object detection, object tracking, or camera-based localization tasks. Therefore, efficient and fast image processing is crucial in those systems. ...
    • TEAPOT: a toolset for evaluating performance, power and image quality on mobile graphics systems 

      Arnau Montañés, José María; Parcerisa Bundó, Joan Manuel; Xekalakis, Polychronis (ACM, 2013)
      Conference paper
      Open access
      In this paper we present TEAPOT, a full system GPU simulator, whose goal is to allow the evaluation of the GPUs that reside in mobile phones and tablets. To this extent, it has a cycle accurate GPU model for evaluating ...
    • The dark side of DNN pruning 

      Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María; Riera Villanueva, Marc (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference paper
      Restricted access (publisher's policy)
      DNN pruning has been recently proposed as an effective technique to improve the energy-efficiency of DNN-based solutions. It is claimed that by removing unimportant or redundant connections, the pruned DNN delivers higher ...
    • UNFOLD: a memory-efficient speech recognizer using on-the-fly WFST composition 

      Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2017)
      Conference paper
      Restricted access (publisher's policy)
      Accurate, real-time Automatic Speech Recognition (ASR) requires huge memory storage and computational power. The main bottleneck in state-of-the-art ASR systems is the Viterbi search on a Weighted Finite State Transducer ...
    • δLTA: Decoupling camera sampling from processing to avoid redundant computations in the vision pipeline

      Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
      Conference paper
      Open access
      Continuous Vision (CV) systems are essential for emerging applications like Autonomous Driving (AD) and Augmented/Virtual Reality (AR/VR). A standard CV System-on-a-Chip (SoC) pipeline includes a frontend for image capture ...