  • Adaptive runtime-assisted block prefetching on chip-multiprocessors 

    García Flores, Víctor; Rico Carro, Alejandro; Villavieja Prados, Carlos; Carpenter, Paul M.; Navarro, Nacho; Ramirez, Alex (2016-04-29)
    Article
    Open Access
    Memory stalls are a significant source of performance degradation in modern processors. Data prefetching is a widely adopted and well-studied technique used to alleviate this problem. Prefetching can be performed by the ...
  • Beyond the socket: NUMA-aware GPUs 

    Milic, Ugljesa; Villa, Oreste; Bolotin, Evgeny; Arunkumar, Akhil; Ebrahimi, Eiman; Jaleel, Aamer; Ramirez, Alex; Nellans, David (Association for Computing Machinery, 2017-10)
    Conference lecture
    Open Access
    GPUs achieve high throughput and power efficiency by employing many small single-instruction, multiple-thread (SIMT) cores. To minimize scheduling logic and performance variance, they utilize a uniform memory system and ...
  • Rebalancing the core front-end through HPC code analysis 

    Milic, Ugljesa; Carpenter, Paul; Rico, Alejandro; Ramirez, Alex (IEEE, 2016-09-25)
    Conference lecture
    Open Access
    There is a need to increase performance under the same power and area envelope to achieve Exascale technology in high performance computing (HPC). Today's chip multiprocessor (CMP) design is tailored by traditional ...
  • Rebalancing the core front-end through HPC code analysis 

    Milic, Ugljesa; Carpenter, Paul; Rico, Alejandro; Ramirez, Alex (IEEE, 2016-10-10)
    Conference report
    Open Access
    There is a need to increase performance under the same power and area envelope to achieve Exascale technology in high performance computing (HPC). Today's chip multiprocessor (CMP) design is tailored by traditional ...
  • Sharing the instruction cache among lean cores on an asymmetric CMP for HPC applications 

    Milic, Ugljesa; Rico, Alejandro; Carpenter, Paul; Ramirez, Alex (Institute of Electrical and Electronics Engineers (IEEE), 2017-07-13)
    Conference lecture
    Open Access
    High performance computing (HPC) applications have parallel code sections that must scale to large numbers of cores, which makes them sensitive to serial regions. Current supercomputing systems with heterogeneous or ...
  • The Mont-Blanc prototype: an alternative approach for HPC systems 

    Rajovic, Nikola; Rico, Alejandro; Mantovani, Filippo; Ruiz, Daniel; Vilarrubi, Josep O.; Gomez, Constantino; Backes, Luna; Nieto, Diego; Servat, Harald; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Adeniyi-Jones, Chris; Derradji, Said; Gloaguen, Hervé; Lanucara, Piero; Sanna, Nico; Mehaut, Jean-François; Pouget, Kevin; Videau, Brice; Boyer, Eric; Allalen, Momme; Auweter, Axel; Brayford, David; Tafani, Daniele; Weinberg, Volker; Brömmel, Dirk; Halver, René; Meinke, Jan H.; Beivide Palacio, Ramon; Benito, Mariano; Vallejo, Enrique; Valero Cortés, Mateo; Ramirez, Alex (Institute of Electrical and Electronics Engineers (IEEE), 2016)
    Conference report
    Open Access
    High-performance computing (HPC) is recognized as one of the pillars for further progress in science, industry, medicine, and education. Current HPC systems are being developed to overcome emerging architectural challenges ...