• Checkpoint restart support for heterogeneous HPC applications 

      Parasyris, Konstantinos; Keller, Kai Rasmus; Bautista Gomez, Leonardo; Unsal, Osman Sabri (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Text en actes de congrés
      Accés obert
      As we approach the era of exa-scale computing, fault tolerance is of growing importance. The increasing number of cores as well as the increased complexity of modern heterogenous systems result in substantial decrease of ...
    • Data prefetching on in-order processors 

      Ortega Carrasco, Cristobal; García Flores, Víctor; Moretó Planas, Miquel; Casas, Marc; Rositoru, Roxana (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Text en actes de congrés
      Accés obert
      Low-power processors have attracted attention due to their energy-efficiency. A large market, such as the mobile one, relies on these processors for this very reason. Even High Performance Computing (HPC) systems are ...
    • Experiences with mobile processors for energy efficient HPC 

      Rajovic, Nikola; Rico Carro, Alejandro; Vipond, James; Gelado Fernandez, Isaac; Puzovic, Nikola; Ramírez Bellido, Alejandro (2013)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      The performance of High Performance Computing (HPC) systems is already limited by their power consumption. The majority of top HPC systems today are built from commodity server components that were designed for maximizing ...
    • Spatial support vector regression to detect silent errors in the exascale era 

      Subasi, Omer; Di, Sheng; Bautista Gomez, Leonardo; Balaprakash, Prasanna; Unsal, Osman Sabri; Labarta Mancho, Jesús José; Cristal Kestelman, Adrián; Cappello, Franck (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Text en actes de congrés
      Accés obert
      As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions ...
    • Teaching HPC systems and parallel programming with small-scale clusters 

      Álvarez Martí, Lluc; Ayguadé Parra, Eduard; Mantovani, Filippo (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Text en actes de congrés
      Accés obert
      In the last decades, the continuous proliferation of High-Performance Computing (HPC) systems and data centers has augmented the demand for expert HPC system designers, administrators, and programmers. For this reason, ...