    • A case for resource-conscious out-of-order processors 

      Cristal Kestelman, Adrián; Martínez, José F; Llosa Espuny, José Francisco; Valero Cortés, Mateo (2003-12)
      Article
      Open Access
      Modern out-of-order processors tolerate long-latency memory operations by supporting a large number of in-flight instructions. This is achieved in part through proper sizing of critical resources, such as register files ...
    • A case study of hybrid dataflow and shared-memory programming models: Dependency-based parallel game engine 

      Gajinov, Vladimir; Eric, Igor; Stojanovic, Saša; Milutinovic, Veljko; Unsal, Osman Sabri; Ayguadé Parra, Eduard; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      Recently proposed hybrid dataflow and shared memory programming models combine these two underlying models in order to support a wider range of problems naturally. The effectiveness of such hybrid models for parallel ...
    • A content aware integer register file organization 

      González García, Rubén; Cristal Kestelman, Adrián; Ortega Fernández, Daniel; Veidenbaum, Alex; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Conference report
      Open Access
      A register file is a critical component of a modern superscalar processor. It has a large number of entries and read/write ports in order to enable high levels of instruction parallelism. As a result, the register file's ...
    • A decoupled KILO-instruction processor 

      Pericàs Gleim, Miquel; Cristal Kestelman, Adrián; González García, Rubén; Jiménez, Daniel A.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2006)
      Conference report
      Open Access
      Building processors with large instruction windows has been proposed as a mechanism for overcoming the memory wall, but finding a feasible and implementable design has been an elusive goal. Traditional processors are ...
    • A Demo of FPGA Aggressive Voltage Downscaling: Power and Reliability Tradeoffs 

      Salami, Behzad; Unsal, Osman Sabri; Cristal Kestelman, Adrián (IEEE, 2018-12-06)
      Conference lecture
      Open Access
      The power consumption of digital circuits, e.g., Field Programmable Gate Arrays (FPGAs), is directly related to their operating supply voltages. On the other hand, usually, chip vendors introduce a conservative voltage ...
    • A distributed processor state management architecture for large-window processors 

      González, Isidro; Galluzzi, Marco; Veidenbaum, Alexander V.; Ramírez, Marco Antonio; Cristal Kestelman, Adrián; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Conference report
      Open Access
      Processor architectures with large instruction windows have been proposed to expose more instruction-level parallelism (ILP) and increase performance. Some of the proposed architectures replace a re-order buffer (ROB) with ...
    • A flexible heterogeneous multi-core architecture 

      Pericàs Gleim, Miquel; Cristal Kestelman, Adrián; Cazorla, Francisco; González García, Rubén; Jiménez, Daniel A.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Conference report
      Open Access
      Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a challenge as application mixes in this ...
    • A fully parameterizable low power design of vector fused multiply-add using active clock-gating techniques 

      Ratkovic, Ivan; Palomar, Oscar; Stanic, Milan; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2016)
      Conference report
      Restricted access - publisher's policy
      The need for power-efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a re-tailoring for the mobile market ...
    • A general guide to applying machine learning to computer architecture 

      Nemirovsky, Daniel; Arkose, Tugberk; Markovic, Nikola; Nemirovsky, Mario; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2018)
      Article
      Open Access
      The resurgence of machine learning since the late 1990s has been enabled by significant advances in computing performance and the growth of big data. The ability of these algorithms to detect complex patterns in data which ...
    • A new pointer-based instruction queue design and its power-performance evaluation 

      Ramírez, Marco A; Cristal Kestelman, Adrián; Veidenbaum, Alexander V; Villa, Luis; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference report
      Open Access
      Instruction queues consume a significant amount of power in a high-performance processor. The wakeup logic delay is also a critical timing parameter. This paper compares a commonly used CAM-based instruction queue organization ...
    • A novel architecture for large windows processors 

      González, Isidro; Galluzzi, Marco; Veidenbaum, Alex; Ramírez, Marco Antonio; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2007-11)
      External research report
      Open Access
      Several processor architectures with large instruction windows have been proposed. They improve performance by maintaining hundreds of instructions in flight to increase the level of instruction parallelism (ILP). Such ...
    • A novel FPGA-based high throughput accelerator for binary search trees 

      Melikoglu, Oyku; Ergin, Oguz; Salami, Behzad; Pavón Rivera, Julián; Unsal, Osman Sabri; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      This paper presents a deeply pipelined and massively parallel Binary Search Tree (BST) accelerator for Field Programmable Gate Arrays (FPGAs). Our design relies on the extremely parallel on-chip memory, or Block RAMs (BRAMs) ...
    • A RISC-V simulator and benchmark suite for designing and evaluating vector architectures 

      Ramírez Lazo, Cristóbal; Hernández, César Alejandro; Palomar Pérez, Óscar; Unsal, Osman Sabri; Ramírez Salinas, Marco Antonio; Cristal Kestelman, Adrián (2020-11)
      Article
      Open Access
      Vector architectures lack tools for research. Consider the gem5 simulator, which is possibly the leading platform for computer-system architecture research. Unfortunately, gem5 does not have an available distribution that ...
    • A two level load/store queue based on execution locality 

      Pericàs Gleim, Miquel; Cristal Kestelman, Adrián; Cazorla, Francisco; González García, Rubén; Veidenbaum, Alexander V; Jiménez, Daniel A.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Conference report
      Open Access
      Multicore processors have emerged as a powerful platform on which to efficiently exploit thread-level parallelism (TLP). However, due to Amdahl’s Law, such designs will be increasingly limited by the remaining sequential ...
    • Accelerating Hash-Based Query Processing Operations on FPGAs by a Hash Table Caching Technique 

      Salami, Behzad; Arcas-Abella, Oriol; Sonmez, Nehir; Unsal, Osman; Cristal Kestelman, Adrián (Springer International Publishing, 2017-04-29)
      Conference lecture
      Open Access
      Extracting valuable information from the rapidly growing field of Big Data faces serious performance constraints, especially in the software-based database management systems (DBMS). In a query processing system, hash-based ...
    • Advanced pattern based memory controller for FPGA based HPC applications 

      Hussain, Tassadaq; Palomar Pérez, Óscar; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      The ever-increasing complexity of high-performance computing applications limits performance due to memory constraints in FPGAs. To address this issue, we propose the Advanced Pattern based Memory Controller (APMC), which ...
    • AMMC: advanced multi-core memory controller 

      Hussain, Tassadaq; Palomar Pérez, Óscar; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference lecture
      Open Access
      In this work, we propose an efficient scheduler and intelligent memory manager known as AMMC (Advanced Multi-Core Memory Controller), which proficiently handles data movement and computational tasks. The proposed AMMC ...
    • An academic RISC-V silicon implementation based on open-source components 

      Abella Ferrer, Jaume; Bulla, Calvin; Cabo Pitarch, Guillem; Cazorla Almeida, Francisco Javier; Cristal Kestelman, Adrián; Doblas Font, Max; Figueras Bagué, Roger; González Trejo, Alberto; Hernández Luz, Carles; Hernández Calderón, César Alejandro; Jiménez Arador, Víctor; Kosmidis, Leonidas; Kostalampros, Ioannis-Vatistas; Langarita Benítez, Rubén; Leyva Santes, Neiel; López Paradís, Guillem; Marimon Illana, Joan; Martínez Martínez, Ricardo; Mendoza Escobar, Jonnatan; Moll Echeto, Francisco de Borja; Moreto Planas, Miquel; Pavón Rivera, Julián; Ramírez Lazo, Cristóbal; Ramírez Salinas, Marco Antonio; Rojas Morales, Carlos; Rubio Sola, Jose Antonio; Ruiz, Abraham Josafat; Sonmez, Nehir; Soria Pardos, Víctor; Teres Teres, Lluis; Unsal, Osman Sabri; Valero Cortés, Mateo; Vargas Valdivieso, Iván; Villa Vargas, Luis Alfonso (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      The design presented in this paper, called preDRAC, is a RISC-V general purpose processor capable of booting Linux jointly developed by BSC, CIC-IPN, IMB-CNM (CSIC), and UPC. The preDRAC processor is the first RISC-V ...
    • An empirical evaluation of High-Level Synthesis languages and tools for database acceleration 

      Arcas Abella, Oriol; Ndu, Geoffrey; Sönmez, Nehir; Ghasempour, Mohsen; Armejach, Adrià; Navaridas, Javier; Song, Wei; Mawer, John; Cristal Kestelman, Adrián; Lujan, Mikel (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Open Access
      High Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS for a certain class of algorithms ...
    • An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration 

      Salami, Behzad; Onural, Erhan Baturay; Yuksel, Ismail Emir; Koc, Fahrettin; Ergin, Oguz; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Sarbazi-Azad, Hamid; Mutlu, Onur (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field ...