Ara es mostren els items 1-20 de 26

    • A case for malleable thread-level linear algebra libraries: The LU factorization with partial pivoting 

      Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Rodríguez Sánchez, Rafael; Van De Geijn, Robert (Institute of Electrical and Electronics Engineers (IEEE), 2019-01-31)
      Article
      Accés obert
      We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target ...
    • A flexible heterogeneous multi-core architecture 

      Pericàs Gleim, Miquel; Cristal Kestelman, Adrián; Cazorla, Francisco; González García, Rubén; Jiménez, Daniel A.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Text en actes de congrés
      Accés obert
      Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a challenge as application mixes in this ...
    • A low-complexity, high-performance fetch unit for simultaneous multithreading processors 

      Falcón Samper, Ayose Jesús; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Text en actes de congrés
      Accés obert
      Simultaneous multithreading (SMT) is an architectural technique that allows for the parallel execution of several threads simultaneously. Fetch performance has been identified as the most important bottleneck for SMT ...
    • Balancing HPC applications through smart allocation of resources in MT processors 

      Boneti, Carlos; Gioiosa, Roberto; Cazorla, Francisco; Corbalán González, Julita; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Text en actes de congrés
      Accés obert
      Many studies have shown that load imbalancing causes significant performance degradation in High Performance Computing (HPC) applications. Nowadays, Multi-Threaded (MT1) processors are widely used in HPC for their good ...
    • Branch classification to control instruction fetch in simultaneous multithreaded architectures 

      Knijnenburg, Peter M.W.; Ramírez Bellido, Alejandro; Latorre Salinas, Fernando; Larriba Pey, Josep; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2002)
      Text en actes de congrés
      Accés obert
      In simultaneous multithreaded architectures many separate threads are running concurrently, sharing processor resources, thereby realizing a high utilization rate of the available hardware. However, this also implies that ...
    • Dcache Warn: an I-fetch policy to increase SMT efficiency 

      Cazorla Almeida, Francisco Javier; Ramírez Bellido, Alejandro; Valero Cortés, Mateo; Fernandez Garcia, Enrique (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Text en actes de congrés
      Accés obert
      Simultaneous multithreading (SMT) processors increase performance by executing instructions from multiple threads simultaneously. These threads share the processor's resources, but also compete for them. In this environment, ...
    • DLP+TLP processors for the next generation of media workloads 

      Corbal San Adrián, Jesús; Espasa Sans, Roger; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2001)
      Text en actes de congrés
      Accés obert
      Future media workloads will require about two levels of magnitude the performance achieved by current general purpose processors. High uni-threaded performance will be needed to accomplish real-time constraints together ...
    • HARP: Adaptive abort recurrence prediction for Hardware Transactional Memory 

      Armejach Sanosa, Adrià; Negi, Anurag; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Stenström, Per; Harris, Tim (Institute of Electrical and Electronics Engineers (IEEE), 2013)
      Text en actes de congrés
      Accés obert
      Hardware Transactional Memory (HTM) exposes parallelism by allowing possibly conflicting sections of code, called transactions, to execute concurrently in multithreaded applications. However, conflicts among concurrent ...
    • HPC system software for regular and irregular parallel applications 

      Morari, Alessandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2013)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      The upcoming generation of system software for High Performance Computing is expected to provide a richer set of functionalities without compromising application performance. This Ph.D. thesis addresses the problem of ...
    • MLP-aware dynamic cache partitioning 

      Moretó Planas, Miquel; Cazorla Almeida, Francisco Javier; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Comunicació de congrés
      Accés obert
      The limitation imposed by instruction-level parallelism (ILP) has motivated the use of thread-level parallelism (TLP) as a common strategy for improving processor performance. TLP paradigms such as simultaneous multithreading ...
    • MT-SBST: self-test optimization in multithreaded multicore architectures 

      Foutris, Nikos; Psarakis, M.; Gizopoulos, Dimitris; Apostolakis, A.; Vera Rivera, Francisco Javier; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2010)
      Text en actes de congrés
      Accés obert
      Instruction-based or software-based self-testing (SBST) is a scalable functional testing paradigm that has gained increasing acceptance in testing of single-threaded uniprocessors. Recent computer architecture trends towards ...
    • On extending collaboration in virtual reality environments 

      Theoktisto Colmenares, Victor Arturo; Fairén González, Marta (2005-10)
      Report de recerca
      Accés obert
      We characterize the feature superset of Collaborative Virtual Reality Environments (CVREs) out of existing implementations, and derive a novel component framework for transforming standalone VR tools into full-fledged ...
    • Online prediction of applications cache utility 

      Moretó Planas, Miquel; Cazorla, Francisco; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Text en actes de congrés
      Accés obert
      General purpose architectures are designed to offer average high performance regardless of the particular application that is being run. Performance and power inefficiencies appear as a consequence for some programs. ...
    • Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture 

      Ródenas Picó, David; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Almási, George; Cascaval, Calin; Castaños, José G.; Moreira, Jose E. (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Text en actes de congrés
      Accés obert
      In this paper, we present two approaches to improve the execution of OpenMP applications on the IBM Cyclops multithreaded architecture. Both solutions are independent and they are focused to obtain better performance through ...
    • Performance power efficiency and scalabity of asymmetric chip multiprocessors 

      Morad, Tomer Y.; Weiser, Uri C.; Kolodny, Avinoam; Valero Cortés, Mateo; Ayguadé Parra, Eduard (2006-01)
      Article
      Accés restringit per política de l'editorial
      This paper evaluates asymmetric cluster chip multiprocessor (ACCMP) architectures as a mechanism to achieve the highest performance for a given power budget. ACCMPs execute serial phases of multithreaded programs on large ...
    • QoS for high-performance SMT processors in embedded systems 

      Cazorla Almeida, Francisco Javier; Ramírez Bellido, Alejandro; Valero Cortés, Mateo; Knijnenburg, Peter M.W.; Sakellariou, Rizos; Fernández Garcia, Enrique (2004-07)
      Article
      Accés obert
      Although simultaneous multithreading processors provide a good cost-performance tradeoff, they exhibit unpredictable performance in real-time applications. We present a resource management scheme that eliminates a major ...
    • Quantifying the benefits of SPECint distant parallelism in simultaneous multithreading architectures 

      Ortega Fernández, Daniel; Martel Pérez, Iván; Krishnan, Venkata; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 1999)
      Text en actes de congrés
      Accés obert
      We exploit the existence of distant parallelism that future compilers could detect and characterise its performance under simultaneous multithreading architectures. By distant parallelism we mean parallelism that cannot ...
    • Runahead threads to improve SMT performance 

      Ramirez Garcia, Tanausú; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Text en actes de congrés
      Accés obert
      In this paper, we propose Runahead Threads (RaT) as a valuable solution for both reducing resource contention and exploiting memory-level parallelism in Simultaneous Multithreaded (SMT) processors. Our technique converts ...
    • Software-controlled priority characterization of POWER5 processor 

      Boneti, Carlos; Cazorla, Francisco; Gioiosa, Roberto; Buyuktosunoglu, Alper; Cher, Chen-Yong; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Text en actes de congrés
      Accés obert
      Due to the limitations of instruction-level parallelism, thread-level parallelism has become a popular way to improve processor performance. One example is the IBM POWER5TM processor, a two-context simultaneous-multithreaded ...
    • Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors 

      Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Rodríguez Sánchez, Rafael (2018-08)
      Article
      Accés obert
      We analyze the benefits of look-ahead in the parallel execution of the LU factorization with partial pivoting (LUpp) in two distinct “asymmetric” multicore scenarios. The first one corresponds to an actual hardware-asymmetric ...