  • Active measurement of memory resource consumption 

    Casas, Marc; Bronevetsky, Greg (IEEE, 2014)
    Conference report
    Open Access
    Hierarchical memory is a cornerstone of modern hardware design because it provides high memory performance and capacity at a low cost. However, the use of multiple levels of memory and complex cache management policies ...
  • Active Measurement of the Impact of Network Switch Utilization on Application Performance 

    Casas, Marc; Bronevetsky, Greg (IEEE, 2014)
    Conference report
    Open Access
    Inter-node networks are a key capability of High-Performance Computing (HPC) systems that differentiates them from less capable classes of machines. However, in spite of their very high performance, the increasing ...
  • Approximating a Multi-Grid Solver 

    Le Fèvre, Valentin; Bautista-Gomez, Leonardo; Unsal, Osman; Casas, Marc (IEEE, 2019-02-14)
    Conference lecture
    Open Access
    Multi-grid methods are numerical algorithms used in parallel and distributed processing. The main idea of multigrid solvers is to speed up the convergence of an iterative method by reducing the problem to a coarser grid a ...
  • Architectural support for task dependence management with flexible software scheduling 

    Castillo, Emilio; Álvarez Martí, Lluc; Moreto Planas, Miquel; Casas, Marc; Vallejo, Enrique; Bosque, Jose L.; Beivide Palacio, Ramon; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Open Access
    The growing complexity of multi-core architectures has motivated a wide range of software mechanisms to improve the orchestration of parallel executions. Task parallelism has become a very attractive approach thanks to its ...
  • Asynchronous and exact forward recovery for detected errors in iterative solvers 

    Jaulmes, Luc Etienne; Casas, Marc; Moreto Planas, Miquel; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2018-03-19)
    Article
    Open Access
    Current trends and projections show that faults in computer systems become increasingly common. Such errors may be detected, and possibly corrected transparently, e.g. by Error Correcting Codes (ECC). For a program to be ...
  • ATM: approximate task memoization in the runtime system 

    Brumar, Iulian; Casas, Marc; Moreto Planas, Miquel; Valero Cortés, Mateo; Sohi, Gurindar S. (Institute of Electrical and Electronics Engineers (IEEE), 2017)
    Conference report
    Open Access
    Redundant computations appear during the execution of real programs. Multiple factors contribute to these unnecessary computations, such as repetitive inputs and patterns, calling functions with the same parameters or bad ...
  • CATA: Criticality aware task acceleration for multicore processors 

    Castillo, Emilio; Moreto Planas, Miquel; Casas, Marc; Álvarez Martí, Lluc; Vallejo, Enrique; Chronaki, Kallia; Badia Sala, Rosa Maria; Bosque Orero, José Luis; Beivide Palacio, Julio Ramón; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2016)
    Conference report
    Open Access
    Managing criticality in task-based programming models opens a wide range of performance and power optimization opportunities in future manycore systems. Criticality aware task schedulers can benefit from these opportunities ...
  • Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures 

    Álvarez Martí, Lluc; Vilanova, Lluís; Moreto Planas, Miquel; Casas, Marc; González Tallada, Marc; Martorell Bofill, Xavier; Navarro, Nacho; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2015)
    Conference report
    Open Access
    The increasing number of cores in manycore architectures causes important power and scalability problems in the memory subsystem. One solution is to introduce scratchpad memories alongside the cache hierarchy, forming a ...
  • Design Space Exploration of Next-Generation HPC Machines 

    Gómez, Constantino; Martínez, Francesc; Armejach, Adrià; Moretó, Miquel; Mantovani, Filippo; Casas, Marc (2019)
    External research report
    Open Access
    The landscape of High Performance Computing (HPC) system architectures keeps expanding with new technologies and increased complexity. With the goal of improving the efficiency of next-generation large HPC systems, ...
  • Evaluating Scientific Workflow Execution on an Asymmetric Multicore Processor 

    Pietri, Ilia; Zhuang, Sicong; Casas, Marc; Moretó, Miquel; Sakellariou, Rizos (Springer, 2018-02)
    Conference lecture
    Open Access
    Asymmetric multicore architectures that integrate different types of cores are emerging as a potential solution for good performance and power efficiency. Although scheduling can be improved by utilizing an appropriate set ...
  • Evaluation of HPC applications’ Memory Resource Consumption via Active Measurement 

    Casas, Marc; Bronevetsky, Greg (IEEE, 2016)
    Article
    Open Access
    As the number of compute cores per chip continues to rise faster than the total amount of available memory, applications will become increasingly starved for memory storage capacity and bandwidth, making the problem of ...
  • Exploration of architectural parameters for future HPC systems 

    Gómez, Constantino; Martínez, Francesc; Armejach Sanosa, Adrià; Casas, Marc; Mantovani, Filippo; Moreto Planas, Miquel (Barcelona Supercomputing Center, 2019-05-07)
    Conference report
    Open Access
  • Graph partitioning applied to DAG scheduling to reduce NUMA effects 

    Sánchez Barrera, Isaac; Casas, Marc; Moreto Planas, Miquel; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2018)
    Conference lecture
    Open Access
    The complexity of shared memory systems is becoming more relevant as the number of memory domains increases, with different access latencies and bandwidth rates depending on the proximity between the cores and the devices ...
  • How can we improve energy efficiency through user-directed vectorization and task-based parallelization? 

    Caminal, Helena; Caballero, Diego; Cebrián, Juan M.; Ferrer, Roger; Casas, Marc; Moreto Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (Barcelona Supercomputing Center, 2015-05-05)
    Open Access
    Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...
  • Improving scalability of task-based programs 

    Brumar, Iulian; Casas, Marc; Moreto Planas, Miquel (Barcelona Supercomputing Center, 2015-05-05)
    Conference report
    Open Access
    In a multi-core era, parallel programming allows further performance improvements, but with an important programmability cost. We envision that the best approach to parallel programming that can exceed the programmability, ...
  • iQ: an efficient and flexible queue-based simulation framework 

    Roca, Damian; Nemirovsky, Daniel; Casas, Marc; Moreto Planas, Miquel; Valero Cortés, Mateo; Nemirovsky, Mario (Institute of Electrical and Electronics Engineers (IEEE), 2017)
    Conference report
    Open Access
    Conventional system simulators are readily used by computer architects to design and evaluate their processor designs. These simulators provide reasonable levels of accuracy and execution detail but suffer from long ...
  • Iteration-fusing conjugate gradient 

    Zhuang, Sicong; Casas, Marc (Association for Computing Machinery (ACM), 2017-06)
    Conference lecture
    Open Access
    This paper presents the Iteration-Fusing Conjugate Gradient (IFCG) approach, an evolution of the Conjugate Gradient method that consists of i) letting computations from different iterations overlap with one another ...
  • MUSA: a multi-level simulation approach for next-generation HPC machines 

    Grass, Thomas; Allande, César; Armejach, Adrià; Rico, Alejandro; Ayguadé Parra, Eduard; Labarta, Jesús; Valero Cortés, Mateo; Casas, Marc; Moreto Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2016)
    Conference report
    Restricted access - publisher's policy
    The complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically ...
  • On the maturity of parallel applications for asymmetric multi-core processors 

    Chronaki, Kallia; Moreto Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Elsevier, 2019-05-01)
    Article
    Restricted access - publisher's policy
    Asymmetric multi-cores (AMCs) are a successful architectural solution for both mobile devices and supercomputers. By maintaining two types of cores (fast and slow) AMCs are able to provide high performance under the facility ...
  • Performance and energy effects on task-based parallelized applications: User-directed versus manual vectorization 

    Caminal Pallarés, Helena; Caballero de Gea, Diego; Cebrián González, Juan Manuel; Ferrer, Roger; Casas, Marc; Moreto Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (2018-06)
    Article
    Open Access
    Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...