Browse by author "González Colás, Antonio María"

Boosting LSTM performance through dynamic precision selection

Silfa Feliz, Franyell Antonio; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020)
Conference paper
Open access

The use of low numerical precision is a fundamental optimization included in modern accelerators for Deep Neural Networks (DNNs). The number of bits of the numerical representation is set to the minimum precision that is ...

Boosting point cloud search with a vector unit

Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (2023)
Research report
Open access

Modern robots collect and process point clouds to perform accurate registration and segmentation. The most time-consuming kernel within point cloud processing, namely neighbor search, relies on appropriate data structures, ...

Boosting single-thread performance in multi-core systems through fine-grain multi-threading

Madriles Gimeno, Carles; López Muñoz, Pedro; Codina Viñas, Josep M.; Gibert Codina, Enric; Latorre Salinas, Fernando; Martínez Vicente, Alejandro; Martinez Morais, Raul; González Colás, Antonio María (ACM Press. Association for Computing Machinery, 2009-06)
Conference paper
Restricted access (publisher's policy)

Industry has shifted towards multi-core designs as we have hit the memory and power walls. However, single thread performance remains of paramount importance since some applications have limited thread-level parallelism ...

Boustrophedonic frames: Quasi-optimal L2 caching for textures in GPUs

Joseph, Diya; Aragón Alcaraz, Juan Luis; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2023)
Conference paper
Open access

Literature is plentiful in works exploiting cache locality for GPUs. A majority of them explore replacement or bypassing policies. In this paper, however, we surpass this exploration by fabricating a formal proof for a ...

CGPA: Coarse-Grained Pruning of Activations for Energy-Efficient RNN Inference

Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2019-09-01)
Article
Open access

Recurrent neural networks (RNNs) perform element-wise multiplications across the activations of gates. We show that a significant percentage of activations are saturated and propose coarse-grained pruning of activations ...

Characterizing self-driving tasks in general-purpose architectures

Exenberger Becker, Pedro Henrique; Arnau Montañés, José María; González Colás, Antonio María (European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC), 2021-09-15)
Book chapter
Open access

Autonomous Vehicles (AVs) have the potential to radically change the automotive industry. However, computing solutions for AVs have to meet severe performance constraints to guarantee a safe driving experience. Current ...

Chrysso: an integrated power manager for constrained many-core processors

Jha, Sudhanshu Shekhar; Heirman, Wim; Falcón Samper, Ayose Jesus; Carlson, Trevor E.; Van Craeynest, Kenzo; Tubella Murgadas, Jordi; González Colás, Antonio María; Eeckhout, Lieven (Association for Computing Machinery (ACM), 2015)
Conference paper
Open access

Modern microprocessors are increasingly power-constrained as a result of slowed supply voltage scaling (end of Dennard scaling) in conjunction with the transistor density scaling (Moore's Law). Existing many-core power ...

Circuit propagation delay estimation through multivariate regression-based modeling under spatio-temporal variability

Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio (Institute of Electrical and Electronics Engineers (IEEE), 2010)
Conference paper
Restricted access (publisher's policy)

With every process generation, the problem of variability in physical parameters and environmental conditions poses a great challenge to the design of fast and reliable circuits. Propagation delays which decide circuit ...

Compiler analysis for trace-level speculative multithreaded architectures

Molina Clemente, Carlos; González Colás, Antonio María; Tubella Murgadas, Jordi (Institute of Electrical and Electronics Engineers (IEEE), 2005)
Conference paper
Open access

Trace-level speculative multithreaded processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating ...

Compiler directed early register release

Jones, Timothy M.; O’Boyle, Michael F.P.; Abella Ferrer, Jaume; González Colás, Antonio María; Ergin, Oguz (Institute of Electrical and Electronics Engineers (IEEE), 2005)
Conference paper
Open access

This paper presents a novel compiler directed technique to reduce the register pressure and power of the register file by releasing registers early. The compiler identifies registers that will only be read once and renames ...

Computation reuse in DNNs by exploiting input similarity

Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2018)
Conference paper
Restricted access (publisher's policy)

In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific ...

Control speculation for energy-efficient next-generation superscalar processors

Aragón, Juan Luis; González González, José; González Colás, Antonio María (2006-03)
Article
Open access

Conventional front-end designs attempt to maximize the number of "in-flight" instructions in the pipeline. However, branch mispredictions cause the processor to fetch useless instructions that are eventually squashed, ...

Control speculation in multithreaded processors through dynamic loop detection

Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1998)
Conference paper
Open access

This paper presents a mechanism to dynamically detect the loops that are executed in a program. This technique detects the beginning and the termination of the iterations and executions of the loops without compiler/user ...

Control-flow independence reuse via dynamic vectorization

Pajuelo González, Manuel Alejandro; González Colás, Antonio María; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2005)
Conference paper
Open access

Current processors exploit out-of-order execution and branch prediction to improve instruction level parallelism. When a branch prediction is wrong, processors flush the pipeline and squash all the speculative work. However, ...

Control-flow speculation through value prediction

González, José; González Colás, Antonio María (2001-12)
Article
Restricted access (publisher's policy)

In this paper, we introduce a new branch predictor that predicts the outcome of branches by predicting the value of their inputs and performing an early computation of their results according to the predicted values. The ...

Control-flow speculation through value prediction for superscalar processors

González González, José; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1999)
Conference paper
Open access

In this paper, we introduce a new branch predictor that predicts the outcomes of branches by predicting the value of their inputs and performing an early computation of their results according to the predicted values. The ...

CREW: Computation reuse and efficient weight storage for hardware-accelerated MLPs and RNNs

Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María (2022-08-01)
Article
Open access

Deep Neural Networks (DNNs) have achieved tremendous success for cognitive applications. The core operation in a DNN is the dot product between quantized inputs and weights. Prior works exploit the weight/input repetition ...

Cross-layer system reliability assessment framework for hardware faults

Vallero, Alessandro; Savino, Alessandro; Politano, Gianfranco; Di Carlo, Stefano; Chatzidimitriou, Athanansios; Tselonis, Sotiris; Kaliorakis, Manolis; Gizipoulos, Dimitris; Riera Villanueva, Marc; Canal Corretger, Ramon; González Colás, Antonio María; Kooli, Maha; Bosio, Alberto; Di Natale, Giorgio (Institute of Electrical and Electronics Engineers (IEEE), 2016)
Conference paper
Open access

System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction ...

Data speculative multithreaded architecture

Marcuello, Pedro; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1998)
Conference paper
Open access

We present a novel processor microarchitecture that relieves three of the most important bottlenecks of superscalar processors: the serialization imposed by true dependences, the relatively small window size and the ...

DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support

Pavlou, Demos; Gibert Codina, Enric; Latorre, Fernando; González Colás, Antonio María (2012)
Conference paper
Restricted access (confidentiality agreement)

Dynamic Binary Translators (DBT) and Dynamic Binary Optimization (DBO) by software are used widely for several reasons including performance, design simplification and virtualization. However, the software layer in ...