    • A Bicriteria Simulated Annealing Algorithm for Scheduling Jobs on Parallel Machines with Sequence Dependent Setup Times 

      Persson, Rasmus (Universitat Politècnica de Catalunya, 2008-09)
      Master thesis (pre-Bologna period)
      Open Access
      The study considers the scheduling problem of identical parallel machines subject to minimization of the maximum completion time and the maximum tardiness expressed in a linear convex objective function. The maximum ...
    • A case for merging the ILP and DLP paradigms 

      Quintana Rodríguez, Francisca; Espasa Sans, Roger; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 1998)
      Conference report
      Open Access
      The goal of this paper is to show that instruction level parallelism (ILP) and data-level parallelism (DLP) can be merged in a single architecture to execute vectorizable code at a performance level that cannot be achieved ...
    • A case study of hybrid dataflow and shared-memory programming models: Dependency-based parallel game engine 

      Gajinov, Vladimir; Eric, Igor; Stojanovic, Saša; Milutinovic, Veljko; Unsal, Osman Sabri; Ayguadé Parra, Eduard; Cristal Kestelman, Adrián (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      Recently proposed hybrid dataflow and shared memory programming models combine these two underlying models in order to support a wider range of problems naturally. The effectiveness of such hybrid models for parallel ...
    • A complexity-effective simultaneous multithreading architecture 

      Acosta Ojeda, Carmelo Alexis; Falcón Samper, Ayose Jesús; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2005)
      Conference report
      Open Access
      Different applications may exhibit radically different behaviors and thus have very different requirements in terms of hardware support. In simultaneous multithreading (SMT) architectures, the hardware is shared among ...
    • A computational evaluation of constructive heuristics for the parallel blocking flow shop problem with sequence-dependent setup times 

      Ribas Vila, Immaculada; Companys Pascual, Ramón (2021-01-22)
      Article
      Open Access
      This paper deals with the problem of scheduling jobs in a parallel flow shop environment without buffers between machines and with sequence-dependent setup times in order to minimize the maximum completion time of jobs. ...
    • A content aware integer register file organization 

      González García, Rubén; Cristal Kestelman, Adrián; Ortega Fernández, Daniel; Veidenbaum, Alex; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Conference report
      Open Access
      A register file is a critical component of a modern superscalar processor. It has a large number of entries and read/write ports in order to enable high levels of instruction parallelism. As a result, the register file's ...
    • A cost-effective clustered architecture 

      Canal Corretger, Ramon; Parcerisa Bundó, Joan Manuel; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1999)
      Conference report
      Open Access
      In current superscalar processors, all floating-point resources are idle during the execution of integer programs. As previous works show, this problem can be alleviated if the floating-point cluster is extended to execute ...
    • A data flow language to develop high performance computing DSLs 

      Fernandez, Alejandro; Beltran, Vicenç; Mateo, Sergi; Patejko, Thomas; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2014)
      Conference report
      Restricted access - publisher's policy
      Developing complex scientific applications on high performance systems requires both domain knowledge and expertise in parallel and distributed programming models. In addition, modern high performance systems are heterogeneous, ...
    • A distributed processor state management architecture for large-window processors 

      González, Isidro; Galluzzi, Marco; Veidenbaum, Alexander V.; Ramírez, Marco Antonio; Cristal Kestelman, Adrián; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Conference report
      Open Access
      Processor architectures with large instruction windows have been proposed to expose more instruction-level parallelism (ILP) and increase performance. Some of the proposed architectures replace a re-order buffer (ROB) with ...
    • A flexible heterogeneous multi-core architecture 

      Pericàs Gleim, Miquel; Cristal Kestelman, Adrián; Cazorla, Francisco; González García, Rubén; Jiménez, Daniel A.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Conference report
      Open Access
      Multi-core processors naturally exploit thread-level parallelism (TLP). However, extracting instruction-level parallelism (ILP) from individual applications or threads is still a challenge as application mixes in this ...
    • A hardware runtime for task-based programming models 

      Tan, Xubin; Bosch, Jaume; Álvarez, Carlos; Jiménez González, Daniel; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2019-09-01)
      Article
      Open Access
      Task-based programming models such as OpenMP 5.0 and OmpSs are simple to use and powerful enough to exploit task parallelism of applications over multicore, manycore and heterogeneous systems. However, their software-only ...
    • A highly scalable parallel implementation of H.264 

      Azevedo, Arnaldo; Juurlink, Ben; Meenderinck, Cor; Terechko, Andrei; Hoogerbrugge, Jan; Álvarez Mesa, Mauricio; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2011)
      Article
      Open Access
      Developing parallel applications that can harness and efficiently use future many-core architectures is the key challenge for scalable computing systems. We contribute to this challenge by presenting a parallel implementation ...
    • A low cost split-issue technique to improve performance of SMT clustered VLIW processors 

      Gupta, Manoj; Sánchez Carracedo, Fermín; Llosa Espuny, José Francisco (2010)
      Conference report
      Open Access
      Very Long Instruction Word (VLIW) processors are a popular choice in the embedded domain due to their hardware simplicity, low cost and low power consumption. Simultaneous MultiThreading (SMT) is a popular technique for ...
    • A low-complexity, high-performance fetch unit for simultaneous multithreading processors 

      Falcón Samper, Ayose Jesús; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Conference report
      Open Access
      Simultaneous multithreading (SMT) is an architectural technique that allows for the parallel execution of several threads simultaneously. Fetch performance has been identified as the most important bottleneck for SMT ...
    • A methodology for user-oriented scalability analysis 

      Royo Vallés, María Dolores; Valero García, Miguel; González Colás, Antonio María; Marí, Carme (Institute of Electrical and Electronics Engineers (IEEE), 1997)
      Conference report
      Open Access
      Scalability analysis provides information about the effectiveness of increasing the number of resources of a parallel system. Several methods have been proposed which use different approaches to provide this information. ...
    • A novel architecture for large windows processors 

      González, Isidro; Galluzzi, Marco; Veidenbaum, Alex; Ramírez, Marco Antonio; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2007-11)
      External research report
      Open Access
      Several processor architectures with large instruction windows have been proposed. They improve performance by maintaining hundreds of instructions in flight to increase the level of instruction parallelism (ILP). Such ...
    • A parallel grid-based implementation for real-time processing of event log data of collaborative applications 

      Xhafa Xhafa, Fatos; Paniagua, Claudi; Barolli, Leonard; Caballé Llobet, Santiago (2010)
      Article
      Open Access
      Collaborative applications usually register user interaction in the form of semi-structured plain text event log data. Extracting and structuring of data is a prerequisite for later key processes such as the analysis of ...
    • A partial breadth-first execution model for Prolog 

      Tubella Murgadas, Jordi; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1994)
      Conference report
      Open Access
      MEM (Multipath Execution Model) is a novel model for the execution of Prolog programs which combines a depth-first and breadth-first exploration of the search tree. The breadth-first search allows more than one path of the ...
    • A polymorphic register file for matrix operations 

      Ciobanu, Catalin; Kuzmanov, Georgi; Gaydadjiev, Georgi; Ramírez Bellido, Alejandro (IEEE Computer Society Publications, 2010)
      Conference report
      Open Access
      Previous vector architectures divided the available register file space in a fixed number of registers of equal sizes and shapes. We propose a register file organization which allows dynamic creation of a variable number ...
    • A quantitative assessment of thread-level speculation techniques 

      Marcuello Pascual, Pedro; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2000)
      Conference report
      Open Access
      Speculative thread-level parallelism has been recently proposed as an alternative source of parallelism that can boost the performance for applications where independent threads are hard to find. Several schemes to exploit ...