Browsing by Author "Radojkovic, Petar"
Another trip to the wall: how much will stacked DRAM benefit HPC?
Radulovic, Milan; Živanovič, Darko; Ruiz, Daniel; De Supinski, Bronis; McKee, Sally; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2015)
Conference report
Restricted access - publisher's policy
First defined two decades ago, the memory wall remains a fundamental limitation to system performance. Recent innovations in 3D-stacking technology enable DRAM devices with much higher bandwidths than traditional DIMMs. ...
Characterizing the resource-sharing levels of the UltraSparc T2 processor
Cakarevic, Vladimir; Radojkovic, Petar; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2009)
Conference report
Restricted access - publisher's policy
Thread level parallelism (TLP) has become a popular trend to improve processor performance, overcoming the limitations of extracting instruction level parallelism. Each TLP paradigm, such as Simultaneous Multithreading or ...
DRAM errors in the field: a statistical approach
Živanovič, Darko; Esmaili Dokht, Pouya; Moré, Sergi; Bartolomé, Javier; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2019)
Conference report
Open Access
This paper summarizes our two-year study of corrected and uncorrected errors on the MareNostrum 3 supercomputer, covering 2000 billion MB-hours of DRAM in the field. The study analyzes 4.5 million corrected and 71 uncorrected ...
Energy efficient HPC on embedded SoCs: optimization techniques for Mali GPU
Grasso, Ivan; Radojkovic, Petar; Rajovic, Nikola; Gelado Fernandez, Isaac; Ramírez Bellido, Alejandro (Institute of Electrical and Electronics Engineers (IEEE), 2014)
Conference report
Restricted access - publisher's policy
A lot of effort from academia and industry has been invested in exploring the suitability of low-power embedded technologies for HPC. Although state-of-the-art embedded systems-on-chip (SoCs) inherently contain GPUs that ...
HPC benchmarking: scaling right and looking beyond the average
Radulovic, Milan; Asifuzzaman, Kazi; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Springer, 2018)
Conference report
Open Access
Designing a balanced HPC system requires an understanding of the dominant performance bottlenecks. There is as yet no well established methodology for a unified evaluation of HPC systems and workloads that quantifies the ...
Large-memory nodes for energy efficient high-performance computing
Živanovič, Darko; Radulovic, Milan; Llort, German; Zaragoza, David; Strassburg, Janko; Carpenter, Paul M.; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2016)
Conference report
Open Access
Energy consumption is by far the most important contributor to HPC cluster operational costs, and it accounts for a significant share of the total cost of ownership. Advanced energy-saving techniques in HPC components have ...
Main memory in HPC: do we need more, or could we live with less?
Živanovič, Darko; Pavlovic, Milan; Radulovic, Milan; Shin, Hyunsung; Son, Jongpil; McKee, Sally A.; Carpenter, Paul M.; Radojkovic, Petar; Ayguadé Parra, Eduard (2017-03)
Article
Open Access
An important aspect of High-Performance Computing (HPC) system design is the choice of main memory capacity. This choice becomes increasingly important now that 3D-stacked memories are entering the market. Compared with ...
Main memory latency simulation: the missing link
Sánchez Verdejo, Rommel; Asifuzzaman, Kazi; Radulović, Milan; Radojkovic, Petar; Ayguadé Parra, Eduard; Jacob, Bruce (Association for Computing Machinery (ACM), 2018)
Conference report
Open Access
The community accepted the need for a detailed simulation of main memory. Currently, the CPU simulators are usually coupled with the cycle-accurate main memory simulators. However, coupling CPU and memory simulators is not ...
Mainstream vs. emerging HPC: metrics, trade-offs and lessons learned
Radulović, Milan; Asifuzzaman, Kazi; Živanovič, Darko; Rajovic, Nikola; Colin de Verdiére, Guillaume; Pleiter, Dirk; Marazakis, Manolis; Kallimanis, Nikolaos; Carpenter, Paul Matthew; Radojkovic, Petar; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2018)
Conference report
Open Access
Various servers with different characteristics and architectures are hitting the market, and their evaluation and comparison in terms of HPC features is complex and multidimensional. In this paper, we share our experience ...
Measuring operating system overhead on CMT processors
Radojkovic, Petar; Cakarevic, Vladimir; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Gioiosa, Roberto; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (IEEE Computer Society Publications, 2008)
Conference report
Open Access
Numerous studies have shown that Operating System (OS) noise is one of the reasons for significant performance degradation in clustered architectures. Although many studies examine the OS noise for High Performance Computing ...
Measuring operating system overhead on Sun UltraSparc T1 processor
Radojkovic, Petar; Cakarevic, Vladimir; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Gioiosa, Roberto; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (2009-06)
Conference report
Open Access
Numerous studies have shown that Operating System (OS) noise is one of the reasons for significant performance degradation in clustered architectures. Although many studies examine the OS noise for High Performance Computing, ...
Overhead of the spin-lock loop in UltraSPARC T2
Cakarevic, Vladimir; Radojkovic, Petar; Cazorla Almeida, Francisco Javier; Gioiosa, Roberto; Nemirovsky, Mario; Valero Cortés, Mateo; Pajuelo González, Manuel Alejandro; Verdú Mulà, Javier (2008-06-04)
Conference report
Open Access
Spin locks are a task synchronization mechanism used to provide mutual exclusion to shared software resources. Spin locks have a good performance in several situations over other synchronization mechanisms, i.e., when on ...
Paving the Way Towards a Highly Energy-Efficient and Highly Integrated Compute Node for the Exascale Revolution: The ExaNoDe Approach
Rigo, Alvise; Pinto, Christian; Pouget, Kevin; Raho, Daniel; Dutoit, Denis; Martinez, Pierre-Yves; Doran, Chris; Benini, Luca; Mavroidis, Iakovos; Marazakis, Manolis; Bartsch, Valeria; Lonsdale, Guy; Pop, Antoniu; Goodacre, John; Colliot, Annaïk; Carpenter, Paul; Radojkovic, Petar; Pleiter, Dirk; Drouin, Dominique; Dupont de Dinechin, Benoît (IEEE, 2017-09-28)
Conference lecture
Open Access
Power consumption and high compute density are the key factors to be considered when building a compute node for the upcoming Exascale revolution. Current architectural design and manufacturing technologies are not able ...
Performance impact of a slower main memory: a case study of STT-MRAM in HPC
Asifuzzaman, Kazi; Pavlovic, Milan; Radulovic, Milan; Zaragoza, David; Kwon, Ohseong; Ryoo, Kyung-Chang; Radojkovic, Petar (Barcelona Supercomputing Center, 2017-05-04)
Conference report
Open Access
Memory systems are major contributors to the deployment and operational costs of large-scale HPC clusters [1][2][3], as well as one of the most important design parameters that significantly affect system performance. In ...
Performance impact of a slower main memory: a case study of STT-MRAM in HPC
Asifuzzaman, Kazi; Pavlovic, Milan; Radulovic, Milan; Zaragoza, David; Kwon, Ohseong; Ryoo, Kyung-Chang; Radojkovic, Petar (ACM, 2016-10)
Conference lecture
Open Access
In high-performance computing (HPC), significant effort is invested in research and development of novel memory technologies. One of them is Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM): byte-addressable, ...
PROFET: modeling system performance and energy without simulating the CPU
Radulovic, Milan; Sánchez-Verdejo, Rommel; Carpenter, Paul Matthew; Radojkovic, Petar; Jacob, Bruce; Ayguadé Parra, Eduard (2019-06)
Article
Open Access
The approaching end of DRAM scaling and expansion of emerging memory technologies is motivating a lot of research in future memory systems. Novel memory systems are typically explored by hardware simulators that are slow ...
Rethinking cycle accurate DRAM simulation
Li, Shang; Sánchez Verdejo, Rommel; Radojkovic, Petar; Jacob, Bruce (Association for Computing Machinery (ACM), 2019)
Conference report
Open Access
Cycle accurate DRAM simulations have been the dominating architecture simulation model for DRAM for a long time. Although accurate, its poor simulation speed has not improved for years while a lot of other architecture ...
STT-MRAM for real-time embedded systems: performance and WCET implications
Asifuzzaman, Kazi; Fernández, Mikel; Radojkovic, Petar; Abella Ferrer, Jaume; Cazorla Almeida, Francisco Javier (Association for Computing Machinery (ACM), 2019)
Conference report
Open Access
STT-MRAM is an emerging non-volatile memory quickly approaching DRAM in terms of capacity, frequency and device size. Intensified efforts in STT-MRAM research by the memory manufacturers may indicate a revolution with ...
Thread assignment in multicore/multithreaded processors: A statistical approach
Radojkovic, Petar; Carpenter, Paul M.; Moreto Planas, Miquel; Cakarevic, Vladimir; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (2016-01-01)
Article
Open Access
The introduction of multicore/multithreaded processors, comprised of a large number of hardware contexts (virtual CPUs) that share resources at multiple levels, has made process scheduling, in particular assignment of ...
Thread assignment of multithreaded network applications in multicore/multithreaded processors
Radojkovic, Petar; Cakarevic, Vladimir; Verdú Mulà, Javier; Pajuelo González, Manuel Alejandro; Cazorla Almeida, Francisco Javier; Nemirovsky, Mario; Valero Cortés, Mateo (2013-12)
Article
Open Access
The introduction of multithreaded processors comprised of a large number of cores with many shared resources makes thread scheduling, and in particular optimal assignment of running threads to processor hardware contexts ...