Now showing items 1-20 of 95

    • A case study of hybrid dataflow and shared-memory programming models: Dependency-based parallel game engine 

      Gajinov, Vladimir; Eric, Igor; Stojanovic, Saa; Milutinovic, Veljko; Unsal, Osman Sabri; Ayguadé Parra, Eduard; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      Recently proposed hybrid dataflow and shared memory programming models combine these two underlying models in order to support a wider range of problems naturally. The effectiveness of such hybrid models for parallel ...
    • A Demo of FPGA Aggressive Voltage Downscaling: Power and Reliability Tradeoffs 

      Salami, Behzad; Unsal, Osman Sabri; Cristal Kestelman, Adrián (IEEE, 2018-12-06)
      Conference lecture
      Open Access
      The power consumption of digital circuits, e.g., Field Programmable Gate Arrays (FPGAs), is directly related to their operating supply voltages. On the other hand, usually, chip vendors introduce a conservative voltage ...
    • A fully parameterizable low power design of vector fused multiply-add using active clock-gating techniques 

      Ratkovic, Ivan; Palomar, Oscar; Stanic, Milan; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2016)
      Conference report
      Restricted access - publisher's policy
      The need for power-efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a re-tailoring for the mobile market ...
    • A general guide to applying machine learning to computer architecture 

      Nemirovsky, Daniel; Arkose, Tugberk; Markovic, Nikola; Nemirovsky, Mario; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2018)
      Article
      Open Access
      The resurgence of machine learning since the late 1990s has been enabled by significant advances in computing performance and the growth of big data. The ability of these algorithms to detect complex patterns in data which ...
    • A novel FPGA-based high throughput accelerator for binary search trees 

      Melikoglu, Oyku; Ergin, Oguz; Salami, Behzad; Pavón Rivera, Julián; Unsal, Osman Sabri; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      This paper presents a deeply pipelined and massively parallel Binary Search Tree (BST) accelerator for Field Programmable Gate Arrays (FPGAs). Our design relies on the extremely parallel on-chip memory, or Block RAMs (BRAMs) ...
    • A RISC-V simulator and benchmark suite for designing and evaluating vector architectures 

      Ramírez Lazo, Cristóbal; Hernández, César Alejandro; Palomar Pérez, Óscar; Unsal, Osman Sabri; Ramírez Salinas, Marco Antonio; Cristal Kestelman, Adrián (2020-11)
      Article
      Open Access
      Vector architectures lack tools for research. Consider the gem5 simulator, which is possibly the leading platform for computer-system architecture research. Unfortunately, gem5 does not have an available distribution that ...
    • A runtime heuristic to selectively replicate tasks for application-specific reliability targets 

      Subasi, Omer; Yalcin, Gulay; Zyulkyarov, Ferad; Unsal, Osman Sabri; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Open Access
      In this paper we propose a runtime-based selective task replication technique for task-parallel high performance computing applications. Our selective task replication technique is automatic and does not require ...
    • Advanced pattern based memory controller for FPGA based HPC applications 

      Hussain, Tassadaq; Palomar Pérez, Óscar; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      The ever-increasing complexity of high-performance computing applications limits performance due to memory constraints in FPGAs. To address this issue, we propose the Advanced Pattern based Memory Controller (APMC), which ...
    • AMMC: advance multi-core memory controller 

      Hussain, Tassadaq; Palomar Pérez, Óscar; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference lecture
      Open Access
      In this work, we propose an efficient scheduler and intelligent memory manager known as AMMC (Advanced Multi-Core Memory Controller), which proficiently handles data movement and computational tasks. The proposed AMMC ...
    • An academic RISC-V silicon implementation based on open-source components 

      Abella Ferrer, Jaume; Bulla, Calvin; Cabo Pitarch, Guillem; Cazorla Almeida, Francisco Javier; Cristal Kestelman, Adrián; Doblas Font, Max; Figueras Bagué, Roger; González Trejo, Alberto; Hernández Luz, Carles; Hernández Calderón, César Alejandro; Jiménez Arador, Víctor; Kosmidis, Leonidas; Kostalabros, Ioannis-Vatistas; Langarita Benítez, Rubén; Leyva Santes, Neiel; López Paradís, Guillem; Marimon Illana, Joan; Martínez Martínez, Ricardo; Mendoza Escobar, Jonnatan; Moll Echeto, Francisco de Borja; Moreto Planas, Miquel; Pavón Rivera, Julián; Ramírez Lazo, Cristóbal; Ramírez Salinas, Marco Antonio; Rojas Morales, Carlos; Rubio Sola, Jose Antonio; Ruiz, Abraham Josafat; Sonmez, Nehir; Soria Pardos, Víctor; Teres Teres, Lluis; Unsal, Osman Sabri; Valero Cortés, Mateo; Vargas Valdivieso, Iván; Villa Vargas, Luis Alfonso (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      The design presented in this paper, called preDRAC, is a RISC-V general purpose processor capable of booting Linux jointly developed by BSC, CIC-IPN, IMB-CNM (CSIC), and UPC. The preDRAC processor is the first RISC-V ...
    • An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration 

      Salami, Behzad; Onural, Erhan Baturay; Yuksel, Ismail Emir; Koc, Fahrettin; Ergin, Oguz; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Sarbazi-Azad, Hamid; Mutlu, Onur (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field ...
    • An integrated vector-scalar design on an in-order ARM core 

      Stanic, Milan; Palomar Pérez, Óscar; Hayes, Timothy; Ratkovic, Ivan; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Valero Cortés, Mateo (2017-07)
      Article
      Open Access
      In the low-end mobile processor market, power, energy, and area budgets are significantly lower than in the server/desktop/laptop/high-end mobile markets. It has been shown that vector processors are a highly energy-efficient ...
    • Architecture for object-oriented programming model 

      Markovic, Nikola; González, Rubén; Unsal, Osman Sabri; Valero Cortés, Mateo; Cristal Kestelman, Adrián (2009)
      External research report
      Open Access
      Current mainstream architectures have ISAs that are not able to maintain all the information provided by the application programmer using a high level programming language. Typically, the information that is lost in compiling ...
    • Atomic quake: using transactional memory in an interactive mulitplayer game Server 

      Zyulkyarov, Ferad; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Harris, Tim; Valero Cortés, Mateo (2009)
      Conference report
      Open Access
      Transactional Memory (TM) is being studied widely as a new technique for synchronizing concurrent accesses to shared memory data structures for use in multi-core systems. Much of the initial work on TM has been evaluated ...
    • Checkpoint restart support for heterogeneous HPC applications 

      Parasyris, Konstantinos; Keller, Kai Rasmus; Bautista Gomez, Leonardo Arturo; Unsal, Osman Sabri (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      As we approach the era of exa-scale computing, fault tolerance is of growing importance. The increasing number of cores as well as the increased complexity of modern heterogenous systems result in substantial decrease of ...
    • Circuit design of a dual-versioning L1 data cache for optimistic concurrency 

      Seyedi, Azam; Armejach, Adrià; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Hur, Ibrahim; Valero Cortés, Mateo (2011)
      External research report
      Open Access
      This paper proposes a novel L1 data cache design with dual-versioning SRAM cells (dvSRAM) for chip multi-processors (CMP) that implement optimistic concurrency proposals. In this new cache architecture, each dvSRAM cell ...
    • Circuit design of a novel adaptable and reliable L1 data cache 

      Seyedi, Azam; Yalcin, Gulay; Unsal, Osman Sabri; Cristal Kestelman, Adrián (2013)
      Conference report
      Open Access
      This paper proposes a novel adaptable and reliable L1 data cache design (Adapcache) with the unique capability of automatically adapting itself for different supply voltage levels and providing the highest reliability. ...
    • Clock gate on abort: Towards energy-efficient hardware transactional memory 

      Sanyal, Sutirtha; Roy, Sourav; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2009)
      Conference report
      Open Access
      Transactional Memory (TM) is an emerging technology which promises to make parallel programming easier compared to earlier lock based approaches. However, as with any form of speculation, Transactional Memory too wastes a ...
    • Commit on overflow 

      Stipic, Srdjan; Armejach, Adrià; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2014)
      External research report
      Open Access
      Current commercial CPUs have hardware support for speculative lock elision (SLE). SLE tries to elide the lock by speculatively executing lock protected critical section. If the speculation fails, SLE acquires the lock and ...
    • CRC-based memory reliability for task-parallel HPC applications 

      Subasi, Omer; Unsal, Osman Sabri; Labarta Mancho, Jesús José; Yalcin, Gulay; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Restricted access - publisher's policy
      Memory reliability will be one of the major concerns for future HPC and Exascale systems. This concern is mostly attributed to the expected massive increase in memory capacity and the number of memory devices in Exascale ...