Ara es mostren els items 74-93 de 93

    • Scalability analysis of progressive alignment in a multicore 

      Isaza, Sebastian; Sánchez Castaño, Friman; Gaydadjiev, Georgi; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (IEEE Press. Institute of Electrical and Electronics Engineers, 2010)
      Text en actes de congrés
      Accés obert
      Sequence alignment is a fundamental instrument in Bioinformatics. In recent years, numerous proposals have been addressing the problem of accelerating this class of applications. This, due to the rapid growth of sequence ...
    • Scalability evaluation of a polymorphic register file: a CG case study 

      Ciobanu, Catalin; Martorell Bofill, Xavier; Kuzmanov, Georgi; Ramírez Bellido, Alejandro; Gaydadjiev, Georgi (Springer, 2011)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      We evaluate the scalability of a Polymorphic Register File using the Conjugate Gradient method as a case study. We focus on a heterogeneous multi-processor architecture, taking into consideration critical parameters such ...
    • Scalability of Macroblock-level parallelism for H.264 decoding 

      Álvarez Mesa, Mauricio; Ramírez Bellido, Alejandro; Azevedo, Arnaldo; Meenderinck, Cor; Juurlink, Ben; Valero Cortés, Mateo (IEEE Computer Society Publications, 2009-12-11)
      Text en actes de congrés
      Accés obert
      This paper investigates the scalability of MacroBlock(MB) level parallelization of the H.264 decoder for High Definition (HD) applications. The study includes three parts. First, a formal model for predicting the maximum ...
    • Scalability of parallel video decoding on heterogeneous manycore architectures 

      Álvarez Mesa, Mauricio; Cabarcas Jaramillo, Felipe; Ramírez Bellido, Alejandro; Meenderinck, Cor; Juurlink, Ben; Valero Cortés, Mateo (2011)
      Report de recerca
      Accés obert
      This paper presents an analysis of the scalability of the parallel video decoding on heterogeneous many core architectures. As benchmark, we use a highly parallel H.264/AVC video decoder that generates a large number of ...
    • Simulating whole supercomputer applications 

      González García, Juan; Casas, Marc; Giménez Lucas, Judit; Moretó Planas, Miquel; Ramírez Bellido, Alejandro; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2011-06)
      Article
      Accés restringit per política de l'editorial
      Detailed simulations of large scale message-passing interface parallel applications are extremely time consuming and resource intensive. A new methodology that combines signal processing and data mining techniques plus a ...
    • Software trace cache 

      Ramírez Bellido, Alejandro; Larriba Pey, Josep; Valero Cortés, Mateo (2005-01)
      Article
      Accés obert
      We explore the use of compiler optimizations, which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying hardware resources regardless of the specific details ...
    • Starsscheck: a tool to find errors in task-based parallel programs 

      Carpenter, Paul Matthew; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (Springer Verlag, 2010)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      Star Superscalar is a task-based programming model. The programmer starts with an ordinary C program, and adds pragmas to mark functions as tasks, identifying their inputs and outputs. When the main thread reaches a task, ...
    • Supercomputing with commodity CPUs: are mobile SoCs ready for HPC? 

      Rajovic, Nikola; Carpenter, Paul Matthew; Gelado Fernandez, Isaac; Puzovic, Nikola; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2013)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      In the late 1990s, powerful economic forces led to the adoption of commodity desktop processors in high-performance computing. This transformation has been so effective that the June 2013 TOP500 list is still dominated by ...
    • Task management analysis on the CellBE 

      Rico Carro, Alejandro; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2008-09)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      There is a clear industrial trend towards chip multiprocessors (CMP) as the most power efficient way of further increasing performance. Heterogeneous CMP architectures take one more step along this power efficiency trend ...
    • Task superscalar: an out-of-order task pipeline 

      Etsion, Yoav; Cabarcas, Felipe; Rico Carro, Alejandro; Ramírez Bellido, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (IEEE Computer Society Publications, 2010)
      Text en actes de congrés
      Accés obert
      We present Task Superscalar, an abstraction of instruction-level out-of-order pipeline that operates at the tasklevel. Like ILP pipelines, which uncover parallelism in a sequential instruction stream, task superscalar ...
    • Techniques for enlarging instruction streams 

      Santana Jaria, Oliverio J.; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2005-03)
      Report de recerca
      Accés obert
      This work presents several techniques for enlarging instruction streams. We call stream to a sequence of instructions from the target of a taken branch to the next taken branch, potentially containing multiple basic blocks. ...
    • The abstract streaming machine: compile-time performance modelling of stream programs on heterogeneous multiprocessors 

      Carpenter, Paul Matthew; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (2009)
      Comunicació de congrés
      Accés restringit per política de l'editorial
      Stream programming offers a portable way for regular applications such as digital video, software radio, multimedia and 3D graphics to exploit a multiprocessor machine. The compiler maps a portable stream program onto the ...
    • The effect of code reordering on branch prediction 

      Ramírez Bellido, Alejandro; Larriba Pey, Josep; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2000)
      Text en actes de congrés
      Accés obert
      Branch prediction accuracy is a very important factor for superscalar processor performance. The ability to predict the outcome of a branch allows the processor to effectively use a large instruction window, and extract a ...
    • The Mont-Blanc prototype: an alternative approach for high-performance computing systems 

      Rajovic, Nikola; Ramírez Bellido, Alejandro; Rico, Alejandro; Mantovani, Filippo; Ruiz, Daniel; Villarubi, Oriol; Gómez, Constantino; Backes, Luna; Nieto, Diego; Servat, Harald; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Valero Cortés, Mateo; Adeniyi-Jones, Chris; Derradji, Said; Gloaguen, Hervé; Lanucara, Piero; Sanna, Nico; Mehaut, Jean-François; Pouget, Kevin; Videau, Brice; Boyer, Eric; Allalen, Momme; Auweter, Axel; Brayford, David; Tafani, Daniele; Brömmel, Dirk; Halver, René; Meinke, Jan H.; Beivide Palacio, Ramon; Benito, Mariano; Vallejo, Enrique (2016)
      Report de recerca
      Accés obert
      High-performance computing (HPC) is recognized as one of the pillars for further advance of science, industry, medicine, and education. Current HPC systems are being developed to overcome emerging challenges in order to ...
    • The MPsim simulation tool 

      Acosta Ojeda, Carmelo Alexis; Cazorla, Francisco; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2009)
      Report de recerca
      Accés obert
      In order to evaluate novel ideas, computer architects require simulation tools which model a target architecture. According to the specific accuracy requirements we find very specific simulators, which model a single ...
    • The SARC architecture 

      Gaydadjiev, Georgi; Isaza, Sebastian; Ramírez Bellido, Alejandro; Cabarcas, Felipe; Juurlink, Ben; Álvarez Mesa, Mauricio; Sánchez Castaño, Friman; Azevedo, Arnaldo; Meenderinck, Cor; Ciobanu, Catalin (2010-10)
      Article
      Accés obert
      The SARC architecture is composed of multiple processor types and a set of user-managed direct memory access (DMA) engines that let the runtime scheduler overlap data transfer and computation. The runtime system automatically ...
    • Thread to core assignment in SMT on-chip multiprocessors 

      Acosta Ojeda, Carmelo Alexis; Cazorla Almeida, Francisco Javier; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (IEEE Computer Society Publications, 2009)
      Text en actes de congrés
      Accés obert
      State-of-the-art high-performance processors like the IBM POWER5 and Intel i7 show a trend in industry towards on-chip Multiprocessors (CMP) involving Simultaneous Multithreading (SMT) in each core. In these processors, ...
    • Thread to core assignment in SMT on-chip multiprocessors 

      Acosta Ojeda, Carmelo Alexis; Cazorla Almeida, Francisco Javier; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2009)
      Article
      Accés obert
      State-of-the-art high-performance processors like the IBM POWER5 and Intel i7 show a trend in industry towards on-chip Multiprocessors (CMP) involving Simultaneous Multithreading (SMT) in each core. In these processors, ...
    • Trace cache redundancy: red and blue traces 

      Ramírez Bellido, Alejandro; Larriba Pey, Josep; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2000)
      Text en actes de congrés
      Accés obert
      The objective of this paper is to improve the use of the hardware resources of the trace cache mechanism, reducing the implementation cost with no performance degradation. We achieve that by eliminating the replication of ...
    • Trace-driven simulation of multithreaded applications 

      Rico Carro, Alejandro; Duran González, Alejandro; Cabarcas, Felipe; Etsion, Yoav; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2011)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      Over the past few years, computer architecture research has moved towards execution-driven simulation, due to the inability of traces to capture timing-dependent thread execution interleaving. However, trace-driven simulation ...