Recent Submissions

  • Sliding window support for image processing in autonomous vehicles 

    Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (2022)
    Conference report
    Open Access
    Camera-based autonomous driving extensively manipulates images for object detection, object tracking, or camera-based localization tasks. Therefore, efficient and fast image processing is crucial in those systems. ...
  • DTexL: Decoupled raster pipeline for texture locality 

    Joseph, Diya; Aragón Alcaraz, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open Access
    Contemporary GPU architectures have multiple shader cores and a scheduler that distributes work (threads) among them, focusing on load balancing. These load balancing techniques favor thread distributions that are detrimental ...
  • A programmable accelerator for streaming automatic speech recognition on edge devices 

    Pinto Rivero, Daniel; Arnau Montañés, José María; González Colás, Antonio María (2022)
    Conference report
    Open Access
    Automatic Speech Recognition (ASR) is quickly becoming a mainstream technology, mainly driven by the outstanding accuracy achieved by modern systems based on machine learning. However, these systems often require billions ...
  • XFeatur: Hardware feature extraction for DNN auto-tuning 

    Sierra Acosta, Jorge; Diavastos, Andreas; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open Access
    In this work, we extend the auto-tuning process of the state-of-the-art TVM framework with XFeatur; a tool that extracts new meaningful hardware-related features that improve the quality of the representation of the search ...
  • MEGsim: A novel methodology for efficient simulation of graphics workloads in GPUs

    Ortiz Escribano, Jorge; Corbalán Navarro, David; Aragón Alcaraz, Juan Luis; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open Access
    An important drawback of cycle-accurate microarchitectural simulators is that they are several orders of magnitude slower than the system they model. This becomes an important issue when simulations have to be repeated ...
  • DTM-NUCA: dynamic texture mapping-NUCA for energy-efficient graphics rendering 

    Corbalán Navarro, David; Aragón, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open Access
    Modern mobile GPUs integrate an increasing number of shader cores to speed up the execution of graphics workloads. Each core integrates a private Texture Cache to apply texturing effects on objects, which is backed up by a ...
  • TCOR: a tile cache with optimal replacement 

    Joseph, Diya; Aragón, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
    Conference report
    Open Access
    Cache Replacement Policies are known to have an important impact on hit rates. The OPT replacement policy [27] has been formally proven as optimal for minimizing misses. Due to its need to look far ahead for future memory ...
  • Improving the energy efficiency of the graphics pipeline by reducing overshading 

    Corbalán Navarro, David; Aragón, Juan Luis; Anglada Sánchez, Martí; de Lucas Casamayor, Enrique; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (2021)
    Conference report
    Open Access
    The most common task of GPUs is to render images in real time. When rendering a 3D scene, a key step is determining which parts of every object are visible in the final image. There are different approaches to solve the ...
  • A low-power hardware accelerator for ORB feature extraction in self-driving cars 

    Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2021)
    Conference report
    Open Access
    Simultaneous Localization And Mapping (SLAM) is a key component for autonomous navigation. SLAM consists of building and creating a map of an unknown environment while keeping track of the exploring agent's location within ...
  • Boosting LSTM performance through dynamic precision selection 

    Silfa Feliz, Franyell Antonio; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference report
    Open Access
    The use of low numerical precision is a fundamental optimization included in modern accelerators for Deep Neural Networks (DNNs). The number of bits of the numerical representation is set to the minimum precision that is ...
  • Demystifying power and performance bottlenecks in autonomous driving systems 

    Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020)
    Conference report
    Open Access
    Autonomous Vehicles (AVs) have the potential to radically change the automotive industry. However, computing solutions for AVs have to meet severe performance and power constraints to guarantee a safe driving experience. ...
  • DRAM errors in the field: a statistical approach 

    Živanovič, Darko; Esmaili Dokht, Pouya; Moré, Sergi; Bartolomé, Javier; Carpenter, Paul Matthew; Radojković, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2019)
    Conference report
    Open Access
    This paper summarizes our two-year study of corrected and uncorrected errors on the MareNostrum 3 supercomputer, covering 2000 billion MB-hours of DRAM in the field. The study analyzes 4.5 million corrected and 71 uncorrected ...
