Now showing items 136-155 of 357

    • General purpose task-dependence management hardware for task-based dataflow programming models 

      Tan, Xubin; Bosch, Jaume; Vidal, Miquel; Alvarez, Carlos; Jimenez-Gonzalez, Daniel; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Conference report
      Restricted access - publisher's policy
      Task-based programming models such as OpenMP, IntelTBB and OmpSs offer the possibility of expressing dependences among tasks to drive their execution at runtime. Managing these dependences introduces noticeable overheads ...
    • Generating a periodic pattern for VLIW 

      Barrado Muxí, Cristina; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Universidad de Málaga, 1995)
      Conference report
      Open access
      Fine-grain parallelism available in VLIW and superscalar processors can be mainly exploited in computational intensive loops. Aggressive scheduling techniques are required to fully exploit this parallelism. In this paper ...
    • Global misrouting policies in two-level hierarchical networks 

      Garcia, Marina; Vallejo, Enrique; Beivide Palacio, Julio Ramón; Odriozola, Miguel; Camarero Coterillo, Cristobal; Valero Cortés, Mateo; Labarta Mancho, Jesús José; Rodríguez, Germán (2013)
      Conference report
      Restricted access - publisher's policy
      Dragonfly networks are composed of interconnected groups of routers. Adaptive routing allows packets to be forwarded minimally or non-minimally adapting to the traffic conditions in the network. While minimal routing sends ...
    • GMT: Enabling easy development and efficient execution of irregular applications on commodity clusters 

      Morari, Alessandro; Villa, Oreste; Tumeo, Antonino; Chavarria Miranda, Daniel; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2013)
      Conference lecture
      Open access
      In this poster we introduce GMT (Global Memory and Threading library), a custom runtime library that enables efficient execution of irregular applications on commodity clusters. GMT only requires a cluster with x86 nodes ...
    • Graph partitioning applied to DAG scheduling to reduce NUMA effects 

      Sánchez Barrera, Isaac; Casas, Marc; Moretó Planas, Miquel; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2018)
      Conference lecture
      Open access
      The complexity of shared memory systems is becoming more relevant as the number of memory domains increases, with different access latencies and bandwidth rates depending on the proximity between the cores and the devices ...
    • Hardware scheduling algorithms for asymmetric single-ISA CMPs 

      Markovic, Nikola; Nemirovsky, Daniel; Unsal, Osman Sabri; Valero Cortés, Mateo; Cristal Kestelman, Adrián (Barcelona Supercomputing Center, 2015-05-05)
      Conference report
      Open access
      As thread level parallelism in applications has continued to expand, so has research in chip multi-core processors. Since more and more applications become multi-threaded we expect to find a growing number of threads ...
    • Hardware schemes for early register release 

      Monreal Arnal, Teresa; Viñals Yufera, Víctor; González Colás, Antonio María; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2002)
      Conference report
      Open access
      Register files are becoming one of the critical components of current out-of-order processors in terms of delay and power consumption, since their potential to exploit instruction-level parallelism is quite related to the ...
    • Hardware transactional memory with software-defined conflicts 

      Titos Gil, Rubén; Acacio, Manuel E.; García, José M.; Harris, Tim; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Hur, Ibrahim; Valero Cortés, Mateo (2010)
      Conference report
      Open access
      In this paper we propose conflict-defined blocks, a programming language construct that allows programmers to change the concept of conflict from one transaction to another, or even throughout the course of the same ...
    • HD-VideoBench: A benchmark for evaluating high definition digital video applications 

      Álvarez Mesa, Mauricio; Salamí San Juan, Esther; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
      Conference report
      Open access
      HD-VideoBench is a benchmark devoted to high definition (HD) digital video processing. It includes a set of video encoders and decoders (Codecs) for the MPEG-2, MPEG-4 and H.264 video standards. The applications were ...
    • Heuristics for register-constrained software pipelining 

      Llosa Espuny, José Francisco; Valero Cortés, Mateo; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 1996)
      Conference report
      Open access
      Software Pipelining is a loop scheduling technique that extracts parallelism from loops by overlapping the execution of several consecutive iterations. There has been a significant effort to produce throughput-optimal ...
    • Hierarchical clustered register file organization for VLIW processors 

      Zalamea León, Francisco Javier; Llosa Espuny, José Francisco; Ayguadé Parra, Eduard; Valero Cortés, Mateo (IEEE Computer Society, 2003)
      Conference report
      Open access
      Technology projections indicate that wire delays will become one of the biggest constraints in future microprocessor designs. To avoid long wire delays and therefore long cycle times, processor cores must be partitioned ...
    • How can we improve energy efficiency through user-directed vectorization and task-based parallelization? 

      Caminal, Helena; Caballero, Diego; Cebrián, Juan M.; Ferrer, Roger; Casas, Marc; Moretó Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (Barcelona Supercomputing Center, 2015-05-05)
      Conference report
      Open access
      Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...
    • HPC system software for regular and irregular parallel applications 

      Morari, Alessandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2013)
      Conference report
      Restricted access - publisher's policy
      The upcoming generation of system software for High Performance Computing is expected to provide a richer set of functionalities without compromising application performance. This Ph.D. thesis addresses the problem of ...
    • Hybrid transactional memory to accelerate safe lock-based transactions 

      Vallejo, Enrique; Harris, Tim; Cristal Kestelman, Adrián; Unsal, Osman Sabri; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2008)
      Conference report
      Open access
      To reduce the overhead of Software Transactional Memory (STM) there are many recent proposals to build hybrid systems that use architectural support either to accelerate parts of a particular STM algorithm (Ha-TM), or ...
    • Hybrid transactional memory with pessimistic concurrency control 

      Vallejo, Enrique; Sanyal, Sutirtha; Harris, Tim; Vallejo, Fernando; Beivide, Ramón; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo (2011-06)
      Article
      Restricted access - publisher's policy
      Transactional Memory (TM) intends to simplify the design and implementation of the shared-memory data structures used in parallel software. Many Software TM systems are based on writer-locks to protect the data being ...
    • Hypernode reduction modulo scheduling 

      Llosa Espuny, José Francisco; Valero Cortés, Mateo; Ayguadé Parra, Eduard; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 1995)
      Conference report
      Open access
      Software pipelining is a loop scheduling technique that extracts parallelism from loops by overlapping the execution of several consecutive iterations. Most prior scheduling research has focused on achieving minimum execution ...
    • Identifying critical code sections in dataflow programming models 

      Subotic, Vladimir; Sancho, Jose Carlos; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2013)
      Conference report
      Restricted access - publisher's policy
      The years of practice in optimizing applications point that the major issue is focus - identifying the critical code section whose optimization would yield the highest overall speedup. While this issue is mainly solved for ...
    • Impact on performance of fused multiply-add units in aggressive VLIW architectures 

      López Álvarez, David; Llosa Espuny, José Francisco; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 1999)
      Conference report
      Open access
      Loops are the main time consuming part of programs based on floating point computations. The performance of the loops is limited either by recurrences in the computation or by the resources offered by the architecture. ...
    • Implementation of systolic algorithms using pipelined functional units 

      Valero García, Miguel; Navarro Guerrero, Juan José; Llaberia Griñó, José M.; Valero Cortés, Mateo (1990)
      Conference report
      Open access
      The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of ...
    • Implementing Kilo-Instruction multiprocessors 

      Vallejo, Enrique; Galluzzi, Marco; Cristal Kestelman, Adrián; Vallejo, Fernando; Beivide Palacio, Ramon; Stenström, Per; Smith, James E.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Conference report
      Open access
      Multiprocessors are coming into wide-spread use in many application areas, yet there are a number of challenges to achieving a good tradeoff between complexity and performance. For example, while implementing memory coherence ...