    • A complexity-effective simultaneous multithreading architecture 

      Acosta Ojeda, Carmelo Alexis; Falcón Samper, Ayose Jesús; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference report
      Open Access
      Different applications may exhibit radically different behaviors and thus have very different requirements in terms of hardware support. In simultaneous multithreading (SMT) architectures, the hardware is shared among ...
    • A highly scalable parallel implementation of H.264 

      Azevedo, Arnaldo; Juurlink, Ben; Meenderinck, Cor; Terechko, Andrei; Hoogerbrugge, Jan; Álvarez Mesa, Mauricio; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2011)
      Article
      Open Access
      Developing parallel applications that can harness and efficiently use future many-core architectures is the key challenge for scalable computing systems. We contribute to this challenge by presenting a parallel implementation ...
    • A low-complexity, high-performance fetch unit for simultaneous multithreading processors 

      Falcón Samper, Ayose Jesús; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Conference report
      Open Access
      Simultaneous multithreading (SMT) is an architectural technique that allows for the parallel execution of several threads simultaneously. Fetch performance has been identified as the most important bottleneck for SMT ...
    • A module-based cell processor simulator 

      Cabarcas Jaramillo, Felipe; Rico Carro, Alejandro; Rodenas, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC), 2006)
      Conference lecture
      Open Access
      An interesting design alternative to replication-based chip multiprocessors is to create heterogeneous chip multiprocessors composed of several different cores, with one or more of them running the operating system and ...
    • A performance characterization of high definition digital video decoding using H.264/AVC 

      Álvarez Mesa, Mauricio; Salamí San Juan, Esther; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference report
      Open Access
      H.264/AVC is a new international video coding standard that provides higher coding efficiency with respect to previous standards at the expense of a higher computational complexity. The complexity is even higher when ...
    • A performance perspective on energy efficient HPC links 

      Saravanan, Karthikeyan P.; Carpenter, Paul; Ramírez Bellido, Alejandro (Association for Computing Machinery (ACM), 2014)
      Conference lecture
      Open Access
      Energy costs are an increasing part of the total cost of ownership of HPC systems. As HPC systems become increasingly energy proportional in an effort to reduce energy costs, interconnect links stand out for their inefficiency. ...
    • A polymorphic register file for matrix operations 

      Ciobanu, Catalin; Kuzmanov, Georgi; Gaydadjiev, Georgi; Ramírez Bellido, Alejandro (IEEE Computer Society Publications, 2010)
      Conference report
      Open Access
      Previous vector architectures divided the available register file space in a fixed number of registers of equal sizes and shapes. We propose a register file organization which allows dynamic creation of a variable number ...
    • A streaming machine description and programming model 

      Carpenter, Paul; Ródenas Picó, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (2007-07)
      Article
      Restricted access - publisher's policy
      In this paper we present the initial development of a streaming environment based on a programming model and machine description. The stream programming model consists of an extension to the C language and its translation ...
    • ACOTES project: Advanced compiler technologies for embedded streaming 

      Duranton, M.; Munk, H.; Ayguadé Parra, Eduard; Bastoul, C.; Carpenter, Paul; Chamski, Z.; Cohen, A.; Cornero, M.; Dumont, P.; Pop, S.; Pop, A.; Ornstein, A.; Nuzman, D.; Miranda, C.; Martorell Bofill, Xavier; Lindwer, M.; Ladelsky, R.; Ferrer, Roger; Fellahi, M.; Pouchet, L. N; Zaks, A.; Shvadron, U.; Trifunovic, K.; Rohou, E.; Rosen, I.; Ramírez Bellido, Alejandro; Ródenas, D. (2011-04)
      Article
      Open Access
      Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded ...
    • ArchExplorer for automatic design space exploration 

      Desmet, V.; Girbal, Sylvain; Ramírez Bellido, Alejandro; Temam, Olivier; Vega, Augusto (2010-09-09)
      Article
      Open Access
      Growing architectural complexity and stringent time-to-market constraints suggest the need to move architecture design beyond parametric exploration to structural exploration. ArchExplorer is a Web-based permanent and open ...
    • Architectural support for real-time task scheduling in SMT processors 

      Cazorla Almeida, Francisco Javier; Knijnenburg, Peter M.W.; Sakellariou, Rizos; Fernández, Enrique; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2005)
      External research report
      Open Access
      In Simultaneous Multithreaded (SMT) architectures most hardware resources are shared between threads. This provides a good cost/performance trade-off which renders these architectures suitable for use in embedded systems. ...
    • Author retrospective for "Software trace cache" 

      Ramírez Bellido, Alejandro; Falcón Samper, Ayose Jesús; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2014)
      Conference report
      Open Access
      In superscalar processors, capable of issuing and executing multiple instructions per cycle, fetch performance represents an upper bound to the overall processor performance. Unless there is some form of instruction re-use ...
    • Better branch prediction through prophet/critic hybrids 

      Falcón Samper, Ayose Jesús; Stark, Jared; Ramírez Bellido, Alejandro; Lai, Konrad; Valero Cortés, Mateo (2005-01)
      Article
      Open Access
      The prophet/critic hybrid conditional branch predictor has two component predictors. The prophet uses a branch's history to predict its direction. We call this prediction and the ones for branches following it the branch ...
    • Branch classification to control instruction fetch in simultaneous multithreaded architectures 

      Knijnenburg, Peter M.W.; Ramírez Bellido, Alejandro; Latorre Salinas, Fernando; Larriba Pey, Josep; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2002)
      Conference report
      Open Access
      In simultaneous multithreaded architectures many separate threads are running concurrently, sharing processor resources, thereby realizing a high utilization rate of the available hardware. However, this also implies that ...
    • Buffer sizing for self-timed stream programs on heterogeneous distributed memory multiprocessors 

      Carpenter, Paul; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (Springer Verlag, 2010)
      Conference report
      Restricted access - publisher's policy
      Stream programming is a promising way to expose concurrency to the compiler. A stream program is built from kernels that communicate only via point-to-point streams. The stream compiler statically allocates these kernels ...
    • CellSim: a validated modular heterogeneous multiprocessor simulator 

      Cabarcas Jaramillo, Felipe; Rico Carro, Alejandro; Ródenas Picó, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (Thomson Editores Spain, 2007)
      Conference report
      Open Access
      As the number of transistors on a chip continues to increase, power consumption has become the most important constraint in processor design. Therefore, to increase performance, computer architects have decided to use ...
    • Code layout optimizations for transaction processing workloads 

      Ramírez Bellido, Alejandro; Barroso, Luiz A; Gharachorloo, Kourosh; Cohn, Robert; Larriba Pey, Josep; Lowney, P. Geoffrey; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2001)
      Conference report
      Open Access
      Commercial applications such as databases and Web servers constitute the most important market segment for high-performance servers. Among these applications, on-line transaction processing (OLTP) workloads provide a ...
    • COMalaWEB: plataforma basada en noves tecnologies aplicades a la docència 

      Fernández Rubio, Juan Antonio; Fernández Prades, Carlos; Ramírez Bellido, Alejandro; Cabrera-Bean, Margarita; Pomar Berry, Christian (2005-02)
      Conference report
      Open Access
      The possibilities that new information technologies offer when applied to teaching are a topic that has not yet been sufficiently exploited. This project aims to investigate these concepts through the creation ...
    • Comparing last-level cache designs for CMP architectures 

      Vega, Augusto; Rico Carro, Alejandro; Cabarcas, Felipe; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2010)
      Conference report
      Restricted access - publisher's policy
      The emergence of hardware accelerators, such as graphics processing units (GPUs), has challenged the interaction between processing elements (PEs) and main memory. In architectures like the Cell/B.E. or GPUs, the PEs ...
    • CUsched: multiprogrammed workload scheduling on GPU architectures 

      Tanasic, Ivan; Gelado Fernandez, Isaac; Cabezas, Javier; Navarro, Nacho; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2013)
      External research report
      Open Access
      Graphics Processing Units (GPUs) are currently widely used in High Performance Computing (HPC) applications to speed up the execution of massively parallel codes. GPUs are well-suited for such HPC environments because ...