    • A hardware/software co-design of K-mer counting using a CAPI-enabled FPGA 

      Haghi, Abbas; Álvarez Martí, Lluc; Polo Bardés, Jorda; Diamantopoulos, Dionysios; Hagleitner, Christoph; Moreto Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      Advances in Next Generation Sequencing (NGS) technologies have caused the proliferation of genomic applications to detect DNA mutations and guide personalized medicine. These applications have an enormous computational ...
    • Architectural support for task dependence management with flexible software scheduling 

      Castillo, Emilio; Álvarez Martí, Lluc; Moreto Planas, Miquel; Casas, Marc; Vallejo, Enrique; Bosque, Jose L.; Beivide Palacio, Ramon; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Open Access
      The growing complexity of multi-core architectures has motivated a wide range of software mechanisms to improve the orchestration of parallel executions. Task parallelism has become a very attractive approach thanks to its ...
    • CATA: Criticality aware task acceleration for multicore processors 

      Castillo, Emilio; Moreto Planas, Miquel; Casas, Marc; Álvarez Martí, Lluc; Vallejo, Enrique; Chronaki, Kallia; Badia Sala, Rosa Maria; Bosque Orero, José Luis; Beivide Palacio, Julio Ramón; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2016)
      Conference report
      Open Access
      Managing criticality in task-based programming models opens a wide range of performance and power optimization opportunities in future manycore systems. Criticality aware task schedulers can benefit from these opportunities ...
    • Characterizing the impact of last-level cache replacement policies on big-data workloads 

      Jamet, Alexandre Valentin; Álvarez Martí, Lluc; Jiménez, Daniel A.; Casas Guix, Marc (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Conference report
      Open Access
      The vast disparity between Last Level Cache (LLC) and memory latencies has motivated the need for efficient cache management policies. The computer architecture literature abounds with work on LLC replacement policy. ...
    • Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures 

      Álvarez Martí, Lluc; Vilanova, Lluís; Moreto Planas, Miquel; Casas, Marc; González Tallada, Marc; Martorell Bofill, Xavier; Navarro, Nacho; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2015)
      Conference report
      Open Access
      The increasing number of cores in manycore architectures causes important power and scalability problems in the memory subsystem. One solution is to introduce scratchpad memories alongside the cache hierarchy, forming a ...
    • Hardware-software coherence protocol for the coexistence of caches and local memories 

      Álvarez Martí, Lluc; Vilanova, Lluís; González Tallada, Marc; Martorell Bofill, Xavier; Navarro, Nacho; Ayguadé Parra, Eduard (2015-01-01)
      Article
      Open Access
      Cache coherence protocols limit the scalability of multicore and manycore architectures and are responsible for an important amount of the power consumed in the chip. A good way to alleviate these problems is to introduce ...
    • Intelligent adaptation of hardware knobs for improving performance and power consumption 

      Ortega Carrasco, Cristobal; Álvarez Martí, Lluc; Casas, Marc; Bertran, Ramon; Buyuktosunoglu, Alper; Eichenberger, Alexandre; Bose, Pradip; Moreto Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2021-01-01)
      Article
      Open Access
      Current microprocessors include several knobs to modify the hardware behavior in order to improve performance, power, and energy under different workload demands. An impractical and time-consuming offline profiling is ...
    • Peachy Parallel Assignments (EduHPC 2018) 

      Ayguadé Parra, Eduard; Álvarez Martí, Lluc; Banchelli Gracia, Fabio; Burtscher, Martin; González Escribano, Arturo; Gutiérrez Monge, Julián; Joiner, David A.; Kaeli, David; Previlon, Fritz; Rodríguez Gutiez, Eduardo; Bunde, David P. (Institute of Electrical and Electronics Engineers (IEEE), 2018)
      Conference report
      Open Access
      Peachy Parallel Assignments are a resource for instructors teaching parallel and distributed programming. These are high-quality assignments, previously tested in class, that are readily adoptable. This collection of ...
    • Pushing the envelope on free TLB prefetching 

      Vavouliotis, Georgios; Álvarez Martí, Lluc; Casas, Marc (Barcelona Supercomputing Center, 2021-05)
      Conference report
      Open Access
      Frequent Translation Lookaside Buffer (TLB) misses pose significant performance and energy overheads due to page walks required for fetching the translations. The address translation performance bottleneck is further ...
    • Reducing cache coherence traffic with a NUMA-aware runtime approach 

      Caheny, Paul; Álvarez Martí, Lluc; Derradji, Said; Valero Cortés, Mateo; Moreto Planas, Miquel; Casas Guix, Marc (2018-05)
      Article
      Open Access
      Cache Coherent NUMA (ccNUMA) architectures are a widespread paradigm due to the benefits they provide for scaling core count and memory capacity. Also, the flat memory address space they offer considerably improves ...
    • Runtime-assisted cache coherence deactivation in task parallel programs 

      Caheny, Paul; Álvarez Martí, Lluc; Valero Cortés, Mateo; Moreto Planas, Miquel; Casas, Marc (Association for Computing Machinery (ACM), 2018)
      Conference report
      Open Access
      With increasing core counts, the scalability of directory-based cache coherence has become a challenging problem. To reduce the area and power needs of the directory, recent proposals reduce its size by classifying data ...
    • Runtime-aware architectures 

      Casas Guix, Marc; Moreto Planas, Miquel; Álvarez Martí, Lluc; Castillo Villar, Emilio; Chasapis, Dimitrios; Hayes, Timothy; Jaulmes, Luc; Palomar Pérez, Óscar; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Springer, 2015)
      Conference report
      Open Access
      In the last few years, the traditional ways to keep the increase of hardware performance to the rate predicted by Moore’s Law have vanished. When uni-cores were the norm, hardware design was decoupled from the software ...
    • Runtime-guided management of scratchpad memories in multicore architectures 

      Álvarez Martí, Lluc; Moreto Planas, Miquel; Casas Guix, Marc; Castillo Villar, Emilio; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2015)
      Conference report
      Open Access
      The increasing number of cores and the anticipated level of heterogeneity in upcoming multicore architectures cause important problems in traditional cache hierarchies. A good way to alleviate these problems is to add ...
    • Runtime-guided management of stacked DRAM memories in task parallel programs 

      Álvarez Martí, Lluc; Casas, Marc; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Moreto Planas, Miquel (Association for Computing Machinery (ACM), 2018)
      Conference report
      Open Access
      Stacked DRAM memories have become a reality in High-Performance Computing (HPC) architectures. These memories provide much higher bandwidth while consuming less power than traditional off-chip memories, but their limited ...
    • Teaching HPC systems and parallel programming with small-scale clusters 

      Álvarez Martí, Lluc; Ayguadé Parra, Eduard; Mantovani, Filippo (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      In the last decades, the continuous proliferation of High-Performance Computing (HPC) systems and data centers has augmented the demand for expert HPC system designers, administrators, and programmers. For this reason, ...
    • The DeepHealth Toolkit: A unified framework to boost biomedical applications 

      Cancilla, Michele; Canalini, Laura; Bolelli, Federico; Allegretti, Stefano; Carrión Ponz, Salvador; Paredes Palacios, Roberto; Gómez Adrián, Jon A.; Leo, Simone; Piras, Marco Enrico; Pireddu, Luca; Badouh, Asaf; Marco-Sola, Santiago; Álvarez Martí, Lluc; Moreto Planas, Miquel; Grana, Costantino (Institute of Electrical and Electronics Engineers (IEEE), 2021)
      Conference report
      Open Access
      Given the overwhelming impact of machine learning on the last decade, several libraries and frameworks have been developed in recent years to simplify the design and training of neural networks, providing array-based ...
    • Transparent management of scratchpad memories in shared memory programming models 

      Álvarez Martí, Lluc (Universitat Politècnica de Catalunya, 2015-12-16)
      Doctoral thesis
      Open Access
      Cache-coherent shared memory has traditionally been the favorite memory organization for chip multiprocessors thanks to its high programmability. In this organization the cache hierarchy is in charge of moving the data and ...