Now showing items 127-146 of 237

    • Last Bank: dealing with address reuse in non-uniform cache architecture for CMPs 

      Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María (2009-01-16)
      Research report
      Open access
      In response to the constant increase in wire delays, Non-Uniform Cache Architecture (NUCA) has been introduced as an effective memory model for dealing with growing memory latencies. This architecture divides a large memory ...
    • Late allocation and early release of physical registers 

      Monreal Arnal, Teresa; Viñals Yufera, Víctor; González González, José; González Colás, Antonio María; Valero Cortés, Mateo (2004-10)
      Article
      Open access
      The register file is one of the critical components of current processors in terms of access time and power consumption. Among other things, the potential to exploit instruction-level parallelism is closely related to the ...
    • LAWS: Locality-AWare Scheme for automatic speech recognition 

      Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2020-08-01)
      Article
      Open access
      Automatic Speech Recognition (ASR) systems are changing the way people interact with different applications on mobile devices. Fulfilling such user-interactivity requires not only a highly accurate, large-vocabulary ...
    • Leveraging register windows to reduce physical registers to the bare minimum 

      Quiñones, Eduardo; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (2010-12)
      Article
      Open access
      Register window is an architectural technique that reduces memory operations required to save and restore registers across procedure calls. Its effectiveness depends on the size of the register file. Such register requirements ...
    • Leveraging run-time feedback for efficient ASR acceleration 

      Yazdani, Reza; Arnau Montañés, José María; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference paper
      Open access
      In this work, we propose Locality-AWare-Scheme (LAWS) for an Automatic Speech Recognition (ASR) accelerator in order to significantly reduce its energy consumption and memory requirements, by leveraging the locality among ...
    • Lifetime-sensitive modulo scheduling in a production environment 

      Llosa Espuny, José Francisco; Ayguadé Parra, Eduard; González Colás, Antonio María; Valero Cortés, Mateo; Eckhardt, Jason (2001-03)
      Article
      Open access
      This paper presents a novel software pipelining approach, which is called Swing Modulo Scheduling (SMS). It generates schedules that are near optimal in terms of initiation interval, register requirements, and stage count. ...
    • Lightweight register file caching in collector units for GPUs 

      Abaie Shoushtary, Mojtaba; Arnau Montañés, José María; Tubella Murgadas, Jordi; González Colás, Antonio María (Association for Computing Machinery (ACM), 2023)
      Conference paper
      Open access
      Modern GPUs benefit from a sizable Register File (RF) to provide fine-grained thread switching. As the RF is huge and accessed frequently, it consumes a considerable share of the dynamic energy of the GPU. Designing a ...
    • Local scheduling techniques for memory coherence in a clustered VLIW processor with a distributed data cache 

      Gibert Codina, Enric; Sánchez, Jesús; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2003)
      Conference paper
      Open access
      Clustering is a common technique to deal with wire delays. Fully-distributed architectures, where the register file, the functional units and the cache memory are partitioned, are particularly effective to deal with these ...
    • LOCATOR: Low-power ORB accelerator for autonomous cars 

      Taranco Serna, Raúl; Arnau Montañés, José María; González Colás, Antonio María (Elsevier, 2023-04)
      Article
      Open access
      Simultaneous Localization And Mapping (SLAM) is crucial for autonomous navigation. ORB-SLAM is a state-of-the-art Visual SLAM system based on cameras used for self-driving cars. In this paper, we propose a high-performance, ...
    • Low Vccmin fault-tolerant cache with highly predictable performance 

      Abella Ferrer, Jaume; Carretero Casado, Javier Sebastián; Chaparro Valero, Pedro Alonso; Vera Rivera, Francisco Javier; González Colás, Antonio María (IEEE Press. Institute of Electrical and Electronics Engineers, 2009)
      Conference paper
      Access restricted by publisher policy
      Transistors per area unit double in every new technology node. However, the electric field density and power demand grow if Vcc is not scaled. Therefore, Vcc must be scaled in pace with new technology nodes to prevent ...
    • Low-complexity distributed issue queue 

      Abella Ferrer, Jaume; González Colás, Antonio María (IEEE Computer Society, 2004)
      Conference paper
      Open access
      As technology evolves, power density significantly increases and cooling systems become more complex and expensive. The issue logic is one of the processor hotspots and, at the same time, its latency is crucial for the ...
    • Low-power automatic speech recognition through a mobile GPU and a Viterbi accelerator 

      Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María (2017-04-12)
      Article
      Open access
      Automatic speech recognition (ASR) has become a core technology for mobile devices. Delivering real-time and accurate ASR has a huge computational cost, which is challenging to achieve in tightly energy-constrained platforms ...
    • LRU-PEA: A smart replacement policy for non-uniform cache architectures on chip multiprocessors 

      Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María (2009-05-14)
      Research report
      Open access
      The increasing speed gap between processor and memory and the limited memory bandwidth make last-level cache performance crucial for CMP architectures. Non-Uniform Cache Architectures (NUCA) have been introduced to deal ...
    • LRU-PEA: A smart replacement policy for non-uniform cache architectures on chip multiprocessors 

      Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2009)
      Conference paper
      Open access
      The increasing speed gap between processor and memory and the limited memory bandwidth make last-level cache performance crucial for CMP architectures. Non-Uniform Cache Architectures (NUCA) have been introduced to deal ...
    • MASkIt: soft error rate estimation for combinatorial circuits 

      Anglada Sánchez, Martí; Canal Corretger, Ramon; Aragon, Juan Luis; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference paper
      Access restricted by publisher policy
      Integrated circuits are getting increasingly vulnerable to soft errors; as a consequence, soft error rate (SER) estimation has become an important and very challenging goal. In this work, a novel approach for SER estimation ...
    • MEGsim: A novel methodology for efficient simulation of graphics workloads in GPUs

      Ortiz Escribano, Jorge; Corbalán Navarro, David; Aragón Alcaraz, Juan Luis; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2022)
      Conference paper
      Open access
      An important drawback of cycle-accurate microarchitectural simulators is that they are several orders of magnitude slower than the system they model. This becomes an important issue when simulations have to be repeated ...
    • Memory bank predictors 

      Bieschewski, Stefan; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference paper
      Open access
      Cache memories are commonly implemented through multiple memory banks to improve bandwidth and latency. The early knowledge of the data cache bank that an instruction will access can help to improve the performance in ...
    • MeRLiN: Exploiting dynamic instruction behavior for fast and accurate microarchitecture level reliability assessment 

      Kaliorakis, Manolis; Gizopoulos, Dimitris; Canal Corretger, Ramon; González Colás, Antonio María (Association for Computing Machinery (ACM), 2017)
      Conference paper
      Open access
      Early reliability assessment of hardware structures using microarchitecture level simulators can effectively guide major error protection decisions in microprocessor design. Statistical fault injection on microarchitectural ...
    • Mitosis: A speculative multithreaded processor based on pre-computation slices 

      Madriles Gimeno, Carles; García Quiñones, Carlos; Sánchez, Jesús; Marcuello, Pedro; González Colás, Antonio María; Tullsen, Dean; Wang, Hong; Shen, John P. (2008-07)
      Article
      Open access
      This paper presents the Mitosis framework, which is a combined hardware-software approach to speculative multithreading, even in the presence of frequent dependences among threads. Speculative multithreading increases ...
    • MODEST: a model for energy estimation under spatio-temporal variability 

      Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio (Institute of Electrical and Electronics Engineers (IEEE), 2010)
      Conference paper
      Access restricted by publisher policy
      Estimation of static and dynamic energy of caches is critical for high-performance low-power designs. Commercial CAD tools performing energy estimation statically are not aware of the changing operating and environmental ...