  • A runtime heuristic to selectively replicate tasks for application-specific reliability targets

    Subasi, Omer; Yalcin, Gulay; Zyulkyarov, Ferad; Unsal, Osman Sabri; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2016)
    Conference proceedings paper
    Open access
    In this paper we propose a runtime-based selective task replication technique for task-parallel high performance computing applications. Our selective task replication technique is automatic and does not require ...
  • Atomic quake: using transactional memory in an interactive multiplayer game server

    Zyulkyarov, Ferad; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Harris, Tim; Valero Cortés, Mateo (2009)
    Conference proceedings paper
    Open access
    Transactional Memory (TM) is being studied widely as a new technique for synchronizing concurrent accesses to shared memory data structures for use in multi-core systems. Much of the initial work on TM has been evaluated ...
  • Designing and modelling selective replication for fault-tolerant HPC applications 

    Subasi, Omer; Yalcin, Gulay; Zyulkyarov, Ferad; Unsal, Osman Sabri; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2017)
    Conference proceedings paper
    Open access
    Fail-stop errors and Silent Data Corruptions (SDCs) are the most common failure modes for High Performance Computing (HPC) applications. There are studies that address fail-stop errors and studies that address SDCs. However ...
  • Disaggregated Computing. An Evaluation of Current Trends for Datacentres 

    Meyer, Hugo; Sancho, Jose C.; Quiroga, Josue V.; Zyulkyarov, Ferad; Roca, Damian; Nemirovsky, Mario (Elsevier, 2017)
    Article
    Open access
    Next generation data centers will likely be based on the emerging paradigm of disaggregated function-blocks-as-a-unit departing from the current state of mainboard-as-a-unit. Multiple functional blocks or bricks such as ...
  • QuakeTM: Parallelizing a complex serial application using transactional memory 

    Gajinov, Vladimir; Zyulkyarov, Ferad; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Harris, Tim; Valero Cortés, Mateo (2008-11)
    Research report
    Open access
    'Is transactional memory useful?' is the question that cannot be answered until we provide substantial applications that can evaluate its capabilities. While existing TM applications can partially answer the above question, ...
  • Unified fault-tolerance framework for hybrid task-parallel message-passing applications 

    Subasi, Omer; Martsinkevich, Tatiana; Zyulkyarov, Ferad; Unsal, Osman; Labarta, Jesús; Cappello, Franck (SAGE Publications, 2016-09-26)
    Article
    Access restricted by publisher policy
    We present a unified fault-tolerance framework for task-parallel message-passing applications to mitigate transient errors. First, we propose a fault-tolerant message-logging protocol that only requires the restart of the ...
  • Unprotected computing: a large-scale study of DRAM raw error rate on a supercomputer 

    Bautista-Gomez, Leonardo; Zyulkyarov, Ferad; Unsal, Osman; McIntosh-Smith, Simon (ACM, 2016-11-13)
    Conference presentation
    Open access
    Supercomputers offer new opportunities for scientific computing as they grow in size. However, their growth also poses new challenges. Resilience has been recognized as one of the most pressing issues to solve for extreme ...