  • A co-designed HW/SW approach to general purpose program acceleration using a programmable functional unit

    Deb, Abhishek; Codina Viñas, Josep M.; González Colás, Antonio María (IEEE Press. Institute of Electrical and Electronics Engineers, 2011)
    Conference paper
    Restricted access per publisher policy
    In this paper, we propose a novel programmable functional unit (PFU) to accelerate general purpose application execution on a modern out-of-order x86 processor in a complexity-effective way. Code is transformed and ...
  • AGAMOS: A graph-based approach to modulo scheduling for clustered microarchitectures 

    Aleta Ortega, Alexandre; Codina Viñas, Josep M.; Sánchez Navarro, F. Jesús; González Colás, Antonio María; Kaeli, D (2009-06)
    Article
    Open access
    This paper presents AGAMOS, a technique to modulo schedule loops on clustered microarchitectures. The proposed scheme uses a multilevel graph partitioning strategy to distribute the workload among clusters and reduces the ...
  • Boosting single-thread performance in multi-core systems through fine-grain multi-threading 

    Madriles Gimeno, Carles; López Muñoz, Pedro; Codina Viñas, Josep M.; Gibert Codina, Enric; Latorre Salinas, Fernando; Martínez Vicente, Alejandro; Martinez Morais, Raul; González Colás, Antonio María (ACM Press. Association for Computing Machinery, 2009-06)
    Conference paper
    Restricted access per publisher policy
    Industry has shifted towards multi-core designs as we have hit the memory and power walls. However, single thread performance remains of paramount importance since some applications have limited thread-level parallelism ...
  • Graph-partitioning based instruction scheduling for clustered processors 

    Aleta Ortega, Alexandre; Codina Viñas, Josep M.; Sánchez Navarro, F. Jesús; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2001)
    Conference paper
    Open access
    This paper presents a novel scheme to schedule loops for clustered microarchitectures. The scheme is based on a preliminary cluster assignment phase implemented through graph partitioning techniques followed by a scheduling ...
  • Instruction replication for clustered microarchitectures 

    Aleta Ortega, Alexandre; Codina Viñas, Josep M.; González Colás, Antonio María; Kaeli, David (Institute of Electrical and Electronics Engineers (IEEE), 2003)
    Conference paper
    Open access
    This work presents a new compilation technique that uses instruction replication in order to reduce the number of communications executed on a clustered microarchitecture. For such architectures, the need to communicate ...