Now showing items 1-12 of 68

    • SHARP: An adaptable, energy-efficient accelerator for recurrent neural networks 

      Yazdani Aminabadi, Reza; Ruwase, Olatunji; Zhang, Minjia; He, Yuxiong; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023-01-24)
      Article
      Open access
      The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, ...
    • A survey of near-data processing architectures for neural networks 

      Hassanpour, Mehdi; Riera Villanueva, Marc; González Colás, Antonio María (2022-01-17)
      Article
      Open access
      Data-intensive workloads and applications, such as machine learning (ML), are fundamentally limited by traditional computing systems based on the von-Neumann architecture. As data movement operations and energy consumption ...
    • LOCATOR: Low-power ORB accelerator for autonomous cars 

      Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Elsevier, 2023-04)
      Article
      Open access
      Simultaneous Localization And Mapping (SLAM) is crucial for autonomous navigation. ORB-SLAM is a state-of-the-art Visual SLAM system based on cameras used for self-driving cars. In this paper, we propose a high-performance, ...
    • Triangle Dropping: An occluded-geometry predictor for energy-efficient mobile GPUs 

      Corbalán Navarro, David; Aragón Alcaraz, Juan Luis; Anglada Sánchez, Martí; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Association for Computing Machinery (ACM), 2022-09)
      Article
      Open access
      This article proposes a novel micro-architecture approach for mobile GPUs aimed at early removing the occluded geometry in a scene by leveraging frame-to-frame coherence, thus reducing the overall energy consumption. Mobile ...
    • Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads 

      Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (2023-01)
      Article
      Open access
      GPGPU architectures have become the dominant platform for massively parallel workloads, delivering high performance and energy efficiency for popular applications such as machine learning, computer vision or self-driving ...
    • E-BATCH: Energy-efficient and high-throughput RNN batching 

      Silfa Feliz, Franyell Antonio; Arnau Montañés, José María; González Colás, Antonio María (2022-03)
      Article
      Open access
      Recurrent Neural Network (RNN) inference exhibits low hardware utilization due to the strict data dependencies across time-steps. Batching multiple requests can increase throughput. However, RNN batching requires a large ...
    • CREW: Computation reuse and efficient weight storage for hardware-accelerated MLPs and RNNs 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2022-08-01)
      Article
      Open access
      Deep Neural Networks (DNNs) have achieved tremendous success for cognitive applications. The core operation in a DNN is the dot product between quantized inputs and weights. Prior works exploit the weight/input repetition ...
    • Vector extensions in COTS processors to increase guaranteed performance in real-time systems 

      Pujol Torramorell, Roger; Jorba Jorba, Josep; Tabani, Hamid; Kosmidis, Leonidas; Mezzetti, Enrico; Abella Ferrer, Jaume; Cazorla Almeida, Francisco Javier (2023-03)
      Article
      Open access
      The need for increased application performance in high-integrity systems like those in avionics is on the rise as software continues to implement more complex functionalities. The prevalent computing solution for future ...
    • Dynamic sampling rate: harnessing frame coherence in graphics applications for energy-efficient GPUs 

      Anglada Sánchez, Martí; de Lucas Casamayor, Enrique; Parcerisa Bundó, Joan Manuel; Aragón Alcaraz, Juan Luis; González Colás, Antonio María (Springer Nature, 2022)
      Article
      Open access
      In real-time rendering, a 3D scene is modelled with meshes of triangles that the GPU projects to the screen. They are discretized by sampling each triangle at regular space intervals to generate fragments which are then ...
    • Fast and accurate SER estimation for large combinational blocks in early stages of the design 

      Anglada Sánchez, Martí; Canal Corretger, Ramon; Aragón Alcaraz, Juan Luis; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2021-07-01)
      Article
      Open access
      Soft Error Rate (SER) estimation is an important challenge for integrated circuits because of the increased vulnerability brought by technology scaling. This paper presents a methodology to estimate in early stages of the ...
    • Energy-efficient stream compaction through filtering and coalescing accesses in GPGPU memory partitions 

      Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022-07-01)
      Article
      Open access
      Graph-based applications are essential in emerging domains such as data analytics or machine learning. Data gathering in a knowledge-based society requires great data processing efficiency. High-throughput GPGPU architectures ...
    • DNN pruning with principal component analysis and connection importance estimation 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2022-01)
      Article
      Open access
      DNN pruning reduces memory footprint and computational work of DNN-based solutions to improve performance and energy-efficiency. An effective pruning scheme should be able to systematically remove connections and/or neurons ...