Recent submissions

  • δLTA: Decoupling camera sampling from processing to avoid redundant computations in the vision pipeline

    Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
    Conference report
    Open access
    Continuous Vision (CV) systems are essential for emerging applications like Autonomous Driving (AD) and Augmented/Virtual Reality (AR/VR). A standard CV System-on-a-Chip (SoC) pipeline includes a frontend for image capture ...
  • SLIDEX: Sliding window extension for image processing 

    Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
    Conference report
    Open access
    With the rising need for efficient image processing in emerging applications such as Autonomous Driving (AD) and Augmented/Virtual Reality (AR/VR), many existing solutions do not meet their performance and energy efficiency ...
  • QeiHaN: An energy-efficient DNN accelerator that leverages log quantization in NDP architectures 

    Khabbazan, Bahareh; Riera Villanueva, Marc; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
    Conference lecture
    Open access
    The constant growth of DNNs makes them challenging to implement and run efficiently on traditional compute-centric architectures. Some works have attempted to enhance accelerators by adding more compute units and on-chip ...
  • Boustrophedonic frames: Quasi-optimal L2 caching for textures in GPUs 

    Joseph, Diya; Aragón Alcaraz, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
    Conference report
    Open access
    Literature is plentiful in works exploiting cache locality for GPUs. A majority of them explore replacement or bypassing policies. In this paper, however, we surpass this exploration by fabricating a formal proof for a ...
  • Exploiting kernel compression on BNNs 

    Silfa Feliz, Franyell Antonio; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
    Conference report
    Open access
    Binary Neural Networks (BNNs) are showing tremendous success on realistic image classification tasks. Notably, their accuracy is similar to the state-of-the-art accuracy obtained by full-precision models tailored to edge ...
  • K-D Bonsai: ISA-extensions to compress K-D trees for autonomous driving tasks 

    Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
    Conference report
    Open access
    Autonomous Driving (AD) systems extensively manipulate 3D point clouds for object detection and vehicle localization. Thereby, efficient processing of 3D point clouds is crucial in these systems. In this work we propose ...
  • Lightweight register file caching in collector units for GPUs 

    Abaie Shoushtary, Mojtaba; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
    Conference report
    Open access
    Modern GPUs benefit from a sizable Register File (RF) to provide fine-grained thread switching. As the RF is huge and accessed frequently, it consumes a considerable share of the dynamic energy of the GPU. Designing a ...
  • Simple out of order core for GPGPUs 

    Huerta Gañán, Rodrigo; Arnau Montañés, José María; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
    Conference report
    Open access
    GPU architectures have become popular for executing general-purpose programs which rely on having a large number of threads that run concurrently to hide the latency among dependent instructions. This approach has an ...
  • Sliding window support for image processing in autonomous vehicles 

    Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (2022)
    Conference report
    Open access
    Camera-based autonomous driving extensively manipulates images for object detection, object tracking, or camera-based localization tasks. Therefore, efficient and fast image processing is crucial in those systems. ...
  • DTexL: Decoupled raster pipeline for texture locality 

    Joseph, Diya; Aragón Alcaraz, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open access
    Contemporary GPU architectures have multiple shader cores and a scheduler that distributes work (threads) among them, focusing on load balancing. These load balancing techniques favor thread distributions that are detrimental ...
  • A programmable accelerator for streaming automatic speech recognition on edge devices 

    Pinto Rivero, Dennis; Arnau Montañés, José María; González Colás, Antonio María (2022)
    Conference report
    Open access
    Automatic Speech Recognition (ASR) is quickly becoming a mainstream technology, mainly driven by the outstanding accuracy achieved by modern systems based on machine learning. However, these systems often require billions ...
  • XFeatur: Hardware feature extraction for DNN auto-tuning 

    Sierra Acosta, Jorge; Diavastos, Andreas; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open access
    In this work, we extend the auto-tuning process of the state-of-the-art TVM framework with XFeatur; a tool that extracts new meaningful hardware-related features that improve the quality of the representation of the search ...
