Now showing items 37-56 of 237

    • Boosting LSTM performance through dynamic precision selection 

      Silfa Feliz, Franyell Antonio; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference paper
      Open access
      The use of low numerical precision is a fundamental optimization included in modern accelerators for Deep Neural Networks (DNNs). The number of bits of the numerical representation is set to the minimum precision that is ...
    • Boosting point cloud search with a vector unit 

      Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (2023)
      Research report
      Open access
      Modern robots collect and process point clouds to perform accurate registration and segmentation. The most time-consuming kernel within point cloud processing -namely neighbor search- relies on appropriate data structures, ...
    • Boosting single-thread performance in multi-core systems through fine-grain multi-threading 

      Madriles Gimeno, Carles; López Muñoz, Pedro; Codina Viñas, Josep M.; Gibert Codina, Enric; Latorre Salinas, Fernando; Martínez Vicente, Alejandro; Martinez Morais, Raul; González Colás, Antonio María (ACM Press. Association for Computing Machinery, 2009-06)
      Conference paper
      Restricted access (publisher's policy)
      Industry has shifted towards multi-core designs as we have hit the memory and power walls. However, single thread performance remains of paramount importance since some applications have limited thread-level parallelism ...
    • Boustrophedonic frames: Quasi-optimal L2 caching for textures in GPUs 

      Joseph, Diya; Aragón Alcaraz, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
      Conference paper
      Open access
      Literature is plentiful in works exploiting cache locality for GPUs. A majority of them explore replacement or bypassing policies. In this paper, however, we surpass this exploration by fabricating a formal proof for a ...
    • CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2019-09-01)
      Article
      Open access
      Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations ...
    • Characterizing self-driving tasks in general-purpose architectures 

      Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC), 2021-09-15)
      Book chapter
      Open access
      Autonomous Vehicles (AVs) have the potential to radically change the automotive industry. However, computing solutions for AVs have to meet severe performance constraints to guarantee a safe driving experience. Current ...
    • Chrysso: an integrated power manager for constrained many-core processors 

      Jha, Sudhanshu Shekhar; Heirman, Wim; Falcón Samper, Ayose Jesus; Carlson, Trevor E.; Van Craeynest, Kenzo; Tubella Murgadas, Jordi; González Colás, Antonio María; Eeckhout, Lieven (Association for Computing Machinery (ACM), 2015)
      Conference paper
      Open access
      Modern microprocessors are increasingly power-constrained as a result of slowed supply voltage scaling (end of Dennard scaling) in conjunction with the transistor density scaling (Moore's Law). Existing many-core power ...
    • Circuit propagation delay estimation through multivariate regression-based modeling under spatio-temporal variability 

      Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio (Institute of Electrical and Electronics Engineers (IEEE), 2010)
      Conference paper
      Restricted access (publisher's policy)
      With every process generation, the problem of variability in physical parameters and environmental conditions poses a great challenge to the design of fast and reliable circuits. Propagation delays which decide circuit ...
    • Compiler analysis for trace-level speculative multithreaded architectures 

      Molina Clemente, Carlos; González Colás, Antonio María; Tubella Murgadas, Jordi (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference paper
      Open access
      Trace-level speculative multithreaded processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating ...
    • Compiler directed early register release 

      Jones, Timothy M.; O’Boyle, Michael F.P.; Abella Ferrer, Jaume; González Colás, Antonio María; Ergin, Oguz (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference paper
      Open access
      This paper presents a novel compiler directed technique to reduce the register pressure and power of the register file by releasing registers early. The compiler identifies registers that will only be read once and renames ...
    • Computation reuse in DNNs by exploiting input similarity 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference paper
      Restricted access (publisher's policy)
      In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific ...
    • Control speculation for energy-efficient next-generation superscalar processors 

      Aragón, Juan Luis; González González, José; González Colás, Antonio María (2006-03)
      Article
      Open access
      Conventional front-end designs attempt to maximize the number of "in-flight" instructions in the pipeline. However, branch mispredictions cause the processor to fetch useless instructions that are eventually squashed, ...
    • Control speculation in multithreaded processors through dynamic loop detection 

      Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1998)
      Conference paper
      Open access
      This paper presents a mechanism to dynamically detect the loops that are executed in a program. This technique detects the beginning and the termination of the iterations and executions of the loops without compiler/user ...
    • Control-flow independence reuse via dynamic vectorization 

      Pajuelo González, Manuel Alejandro; González Colás, Antonio María; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference paper
      Open access
      Current processors exploit out-of-order execution and branch prediction to improve instruction level parallelism. When a branch prediction is wrong, processors flush the pipeline and squash all the speculative work. However, ...
    • Control-flow speculation through value prediction 

      González, José; González Colás, Antonio María (2001-12)
      Article
      Restricted access (publisher's policy)
      In this paper, we introduce a new branch predictor that predicts the outcome of branches by predicting the value of their inputs and performing an early computation of their results according to the predicted values. The ...
    • Control-flow speculation through value prediction for superscalar processors 

      González González, José; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1999)
      Conference paper
      Open access
      In this paper, we introduce a new branch predictor that predicts the outcomes of branches by predicting the value of their inputs and performing an early computation of their results according to the predicted values. The ...
    • CREW: Computation reuse and efficient weight storage for hardware-accelerated MLPs and RNNs 

      Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2022-08-01)
      Article
      Open access
      Deep Neural Networks (DNNs) have achieved tremendous success for cognitive applications. The core operation in a DNN is the dot product between quantized inputs and weights. Prior works exploit the weight/input repetition ...
    • Cross-layer system reliability assessment framework for hardware faults 

      Vallero, Alessandro; Savino, Alessandro; Politano, Gianfranco; Di Carlo, Stefano; Chatzidimitriou, Athanansios; Tselonis, Sotiris; Kaliorakis, Manolis; Gizipoulos, Dimitris; Riera Villanueva, Marc; Canal Corretger, Ramon; González Colás, Antonio María; Kooli, Maha; Bosio, Alberto; Di Natale, Giorgio (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference paper
      Open access
      System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction ...
    • Data speculative multithreaded architecture 

      Marcuello, Pedro; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1998)
      Conference paper
      Open access
      We present a novel processor microarchitecture that relieves three of the most important bottlenecks of superscalar processors: the serialization imposed by true dependences, the relatively small window size and the ...
    • DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support 

      Pavlou, Demos; Gibert Codina, Enric; Latorre, Fernando; González Colás, Antonio María (2012)
      Conference paper
      Restricted access (confidentiality agreement)
      Dynamic Binary Translators (DBT) and Dynamic Binary Optimization (DBO) by software are used widely for several reasons including performance, design simplification and virtualization. However, the software layer in ...