    • Another trip to the wall: how much will stacked DRAM benefit HPC? 

      Radulovic, Milan; Živanovič, Darko; Ruiz, Daniel; De Supinski, Bronis; McKee, Sally; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2015)
      Conference report
      Restricted access - publisher's policy
      First defined two decades ago, the memory wall remains a fundamental limitation to system performance. Recent innovations in 3D-stacking technology enable DRAM devices with much higher bandwidths than traditional DIMMs. ...
    • Characterizing the resource-sharing levels of the UltraSparc T2 processor 

      Cakarevic, Vladimir; Radojkovic, Petar; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2009)
      Conference report
      Restricted access - publisher's policy
      Thread level parallelism (TLP) has become a popular trend to improve processor performance, overcoming the limitations of extracting instruction level parallelism. Each TLP paradigm, such as Simultaneous Multithreading or ...
    • Cost-aware prediction of uncorrected DRAM errors in the field 

      Boixaderas Coderch, Isaac; Živanovič, Darko; Moré Codina, Sergi; Bartolomé Rodríguez, Javier; Vicente Dorca, David; Casas Guix, Marc; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      This paper presents and evaluates a method to predict DRAM uncorrected errors, a leading cause of hardware failures in large-scale HPC clusters. The method uses a random forest classifier, which was trained and evaluated ...
    • Cost-aware prediction of uncorrected DRAM errors in the field 

      Boixaderas, Isaac; Carpenter, Paul; Radojkovic, Petar; Ayguadé Parra, Eduard (Barcelona Supercomputing Center, 2021-05)
      Conference report
      Open Access
      One of the main causes of hardware failure in large-scale clusters is an uncorrected error in main memory [1]–[4]. Node failures are especially problematic in high-performance computing (HPC), where a single tightly-coupled ...
    • DRAM errors in the field: a statistical approach 

      Živanovič, Darko; Esmaili Dokht, Pouya; Moré, Sergi; Bartolomé, Javier; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2019)
      Conference report
      Open Access
      This paper summarizes our two-year study of corrected and uncorrected errors on the MareNostrum 3 supercomputer, covering 2000 billion MB-hours of DRAM in the field. The study analyzes 4.5 million corrected and 71 uncorrected ...
    • Energy efficient HPC on embedded SoCs: optimization techniques for Mali GPU 

      Grasso, Ivan; Radojkovic, Petar; Rajovic, Nikola; Gelado Fernandez, Isaac; Ramírez Bellido, Alejandro (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      A lot of effort from academia and industry has been invested in exploring the suitability of low-power embedded technologies for HPC. Although state-of-the-art embedded systems-on-chip (SoCs) inherently contain GPUs that ...
    • HPC benchmarking: scaling right and looking beyond the average 

      Radulovic, Milan; Asifuzzaman, Kazi; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Springer, 2018)
      Conference report
      Open Access
      Designing a balanced HPC system requires an understanding of the dominant performance bottlenecks. There is as yet no well established methodology for a unified evaluation of HPC systems and workloads that quantifies the ...
    • Large-memory nodes for energy efficient high-performance computing 

      Živanovič, Darko; Radulovic, Milan; Llort, German; Zaragoza, David; Strassburg, Janko; Carpenter, Paul M.; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2016)
      Conference report
      Open Access
      Energy consumption is by far the most important contributor to HPC cluster operational costs, and it accounts for a significant share of the total cost of ownership. Advanced energy-saving techniques in HPC components have ...
    • Main memory in HPC: do we need more, or could we live with less? 

      Živanovič, Darko; Pavlovic, Milan; Radulovic, Milan; Shin, Hyunsung; Son, Jongpil; McKee, Sally A.; Carpenter, Paul M.; Radojkovic, Petar; Ayguadé Parra, Eduard (2017-03)
      Article
      Open Access
      An important aspect of High-Performance Computing (HPC) system design is the choice of main memory capacity. This choice becomes increasingly important now that 3D-stacked memories are entering the market. Compared with ...
    • Main memory latency simulation: the missing link 

      Sánchez Verdejo, Rommel; Asifuzzaman, Kazi; Radulović, Milan; Radojkovic, Petar; Ayguadé Parra, Eduard; Jacob, Bruce (Association for Computing Machinery (ACM), 2018)
      Conference report
      Open Access
      The community accepted the need for a detailed simulation of main memory. Currently, the CPU simulators are usually coupled with the cycle-accurate main memory simulators. However, coupling CPU and memory simulators is not ...
    • Mainstream vs. emerging HPC: metrics, trade-offs and lessons learned 

      Radulović, Milan; Asifuzzaman, Kazi; Živanovič, Darko; Rajovic, Nikola; Colin de Verdiére, Guillaume; Pleiter, Dirk; Marazakis, Manolis; Kallimanis, Nikolaos; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Open Access
      Various servers with different characteristics and architectures are hitting the market, and their evaluation and comparison in terms of HPC features is complex and multidimensional. In this paper, we share our experience ...
    • Measuring operating system overhead on CMT processors 

      Radojkovic, Petar; Cakarevic, Vladimir; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Gioiosa, Roberto; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (IEEE Computer Society Publications, 2008)
      Conference report
      Open Access
      Numerous studies have shown that Operating System (OS) noise is one of the reasons for significant performance degradation in clustered architectures. Although many studies examine the OS noise for High Performance Computing ...
    • Measuring operating system overhead on Sun UltraSparc T1 processor 

      Radojkovic, Petar; Cakarevic, Vladimir; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Gioiosa, Roberto; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (2009-06)
      Conference report
      Open Access
      Numerous studies have shown that Operating System (OS) noise is one of the reasons for significant performance degradation in clustered architectures. Although many studies examine the OS noise for High Performance Computing, ...
    • Overhead of the spin-lock loop in UltraSPARC T2 

      Cakarevic, Vladimir; Radojkovic, Petar; Cazorla Almeida, Francisco Javier; Gioiosa, Roberto; Nemirovsky, Mario; Valero Cortés, Mateo; Pajuelo González, Manuel Alejandro; Verdú Mulà, Javier (2008-06-04)
      Conference report
      Open Access
      Spin locks are a task synchronization mechanism used to provide mutual exclusion to shared software resources. Spin locks perform well in several situations compared with other synchronization mechanisms, i.e., when on ...
    • Paving the Way Towards a Highly Energy-Efficient and Highly Integrated Compute Node for the Exascale Revolution: The ExaNoDe Approach 

      Rigo, Alvise; Pinto, Christian; Pouget, Kevin; Raho, Daniel; Dutoit, Denis; Martinez, Pierre-Yves; Doran, Chris; Benini, Luca; Mavroidis, Iakovos; Marazakis, Manolis; Bartsch, Valeria; Lonsdale, Guy; Pop, Antoniu; Goodacre, John; Colliot, Annaïk; Carpenter, Paul; Radojkovic, Petar; Pleiter, Dirk; Drouin, Dominique; Dupont de Dinechin, Benoît (IEEE, 2017-09-28)
      Conference lecture
      Open Access
      Power consumption and high compute density are the key factors to be considered when building a compute node for the upcoming Exascale revolution. Current architectural design and manufacturing technologies are not able ...
    • Performance impact of a slower main memory: a case study of STT-MRAM in HPC 

      Asifuzzaman, Kazi; Pavlovic, Milan; Radulovic, Milan; Zaragoza, David; Kwon, Ohseong; Ryoo, Kyung-Chang; Radojkovic, Petar (ACM, 2016-10)
      Conference lecture
      Open Access
      In high-performance computing (HPC), significant effort is invested in research and development of novel memory technologies. One of them is Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM) --- byte-addressable, ...
    • Performance impact of a slower main memory: a case study of STT-MRAM in HPC 

      Asifuzzaman, Kazi; Pavlovic, Milan; Radulovic, Milan; Zaragoza, David; Kwon, Ohseong; Ryoo, Kyung-Chang; Radojkovic, Petar (Barcelona Supercomputing Center, 2017-05-04)
      Conference report
      Open Access
      Memory systems are major contributors to the deployment and operational costs of large-scale HPC clusters [1][2][3], as well as one of the most important design parameters that significantly affect system performance. In ...
    • PROFET: modeling system performance and energy without simulating the CPU 

      Radulovic, Milan; Sánchez-Verdejo, Rommel; Carpenter, Paul Matthew; Radojkovic, Petar; Jacob, Bruce; Ayguadé Parra, Eduard (2019-06)
      Article
      Open Access
      The approaching end of DRAM scaling and expansion of emerging memory technologies is motivating a lot of research in future memory systems. Novel memory systems are typically explored by hardware simulators that are slow ...
    • Rethinking cycle accurate DRAM simulation 

      Li, Shang; Sánchez Verdejo, Rommel; Radojkovic, Petar; Jacob, Bruce (Association for Computing Machinery (ACM), 2019)
      Conference report
      Open Access
      Cycle accurate DRAM simulations have been the dominating architecture simulation model for DRAM for a long time. Although accurate, its poor simulation speed has not improved for years while a lot of other architecture ...
    • STT-MRAM for real-time embedded systems: performance and WCET implications 

      Asifuzzaman, Kazi; Fernández, Mikel; Radojkovic, Petar; Abella Ferrer, Jaume; Cazorla Almeida, Francisco Javier (Association for Computing Machinery (ACM), 2019)
      Conference report
      Open Access
      STT-MRAM is an emerging non-volatile memory quickly approaching DRAM in terms of capacity, frequency and device size. Intensified efforts in STT-MRAM research by the memory manufacturers may indicate a revolution with ...