Now showing items 1-17 of 17

    • Author retrospective for "Software trace cache" 

      Ramírez Bellido, Alejandro; Falcón Samper, Ayose Jesus; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2014)
      Conference report
      Open Access
      In superscalar processors, capable of issuing and executing multiple instructions per cycle, fetch performance represents an upper bound to the overall processor performance. Unless there is some form of instruction re-use ...
    • Code semantic-aware runahead threads 

      Ramírez García, Tanausu; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (2009-09)
      Conference report
      Restricted access - publisher's policy
      Memory-intensive threads can hoard shared re- sources without making progress on a multithreading processor (SMT), thereby hindering the overall system performance. A recent promising solution to overcome this important ...
    • DIA: A complexity-effective decoding architecture 

      Santana Jaria, Oliverio J.; Falcón Samper, Ayose Jesus; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2009-04)
      Article
      Open Access
      Fast instruction decoding is a true challenge for the design of CISC microprocessors implementing variable-length instructions. A well-known solution to overcome this problem is caching decoded instructions in a hardware ...
    • Enlarging instruction streams 

      Santana Jaria, Oliverio J.; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2007-10)
      Article
      Open Access
      The stream fetch engine is a high-performance fetch architecture based on the concept of an instruction stream. We call a sequence of instructions from the target of a taken branch to the next taken branch, potentially ...
    • FAME: FAirly MEasuring multithreaded architectures 

      Vera, Javier; Cazorla, Francisco; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Fernandez Garcia, Enrique; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Conference report
      Open Access
      Nowadays, multithreaded architectures are becoming more and more popular. In order to evaluate their behavior, several methodologies and metrics have been proposed. A methodology defines when the measurements of a given ...
    • Fetching instruction streams 

      Ramírez Bellido, Alejandro; Santana Jaria, Oliverio J.; Larriba Pey, Josep; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2002)
      Conference report
      Open Access
      Fetch performance is a very important factor because it effectively limits the overall processor performance. However there is little performance advantage in increasing front-end performance beyond what the back-end can ...
    • Introducing runahead threads 

      Ramírez García, Tanausu; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (2007-07)
      External research report
      Open Access
      Simultaneous Multithreading processors share their resources among multiple threads in order to improve performance. However, a resource control policy is needed to avoid resource conflicts and prevent some threads from ...
    • Kilo-instruction processors: overcoming the memory wall 

      Cristal Kestelman, Adrián; Santana Jaria, Oliverio J.; Cazorla, Francisco; Galluzzi, Marco; Ramirez Garcia, Tanausú; Pericas, Miquel; Valero Cortés, Mateo (2005-05)
      Article
      Open Access
      Historically, advances in integrated circuit technology have driven improvements in processor microarchitecture and led to todays microprocessors with sophisticated pipelines operating at very high clock frequencies. ...
    • Latency tolerant branch predictors 

      Santana Jaria, Oliverio J.; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (IEEE Press, 2003)
      Conference report
      Open Access
      The access latency of branch predictors is a well known problem of fetch engine design. Prediction overriding techniques are commonly accepted to overcome this problem. However, prediction overriding requires a complex ...
    • Maximizing multithreaded multicore architectures through thread migrations 

      Acosta Ojeda, Carmelo Alexis; Cazorla Almeida, Francisco Javier; Santana Jaria, Oliverio J.; Falcón Samper, Ayose Jesús; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2009)
      External research report
      Open Access
      Heterogeneity in general-purpose workloads often end up in non optimal per-thread hardware resource usage. The current trend towards multicore architectures, containing several multithreaded cores, increases the need of a ...
    • On the problem of evaluating the performance of multiprogrammed workloads 

      Cazorla, Francisco; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Fernández Rodríguez, José Enrique; Valero Cortés, Mateo (2010-12)
      Article
      Open Access
      Multithreaded architectures are becoming more and more popular. In order to evaluate their behavior, several methodologies and metrics have been proposed. A methodology defines when the measurements for a given workload ...
    • Predicting multiple streams per cycle 

      Santana Jaria, Oliverio J.; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2005)
      External research report
      Open Access
      The next stream predictor is an accurate branch predictor that provides stream level sequencing. Every stream prediction contains a full stream of instructions, that is, a sequence of instructions from the target of a taken ...
    • Reducing fetch architecture complexity using procedure inlining 

      Santana Jaria, Oliverio J.; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Conference report
      Open Access
      Fetch engine performance is seriously limited by the branch prediction table access latency. This fact has lead to the development of hardware mechanisms, like prediction overriding, aimed to tolerate this latency. However, ...
    • ROB-free architecture proposal 

      González, Isidro; Galluzzi, Marco; Cristal Kestelman, Adrián; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (2007-09)
      External research report
      Open Access
      Modern processors improve performance by taking advantage of the instruction level parallelism (ILP) by means of allowing hundreds of instructions in flight. However, they still have to face an important source of degradation ...
    • Runahead threads to improve SMT performance 

      Ramirez Garcia, Tanausú; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Conference report
      Open Access
      In this paper, we propose Runahead Threads (RaT) as a valuable solution for both reducing resource contention and exploiting memory-level parallelism in Simultaneous Multithreaded (SMT) processors. Our technique converts ...
    • Runahead threads: reducing resource contention in SMT processors 

      Ramírez García, Tanausu; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Conference lecture
      Open Access
      In this work, we propose Runahead threads as a valuable solution for both exploiting memory-level parallelism and reducing resource contention in simultaneous multithreaded processors.
    • Techniques for enlarging instruction streams 

      Santana Jaria, Oliverio J.; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2005-03)
      External research report
      Open Access
      This work presents several techniques for enlarging instruction streams. We call stream to a sequence of instructions from the target of a taken branch to the next taken branch, potentially containing multiple basic blocks. ...