Ara es mostren els items 152-171 de 357

    • Identifying critical code sections in dataflow programming models 

      Subotic, Vladimir; Sancho, Jose Carlos; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2013)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      The years of practice in optimizing applications point that the major issue is focus - identifying the critical code section whose optimization would yield the highest overall speedup. While this issue is mainly solved for ...
    • Impact on performance of fused multiply-add units in aggressive VLIW architectures 

      López Álvarez, David; Llosa Espuny, José Francisco; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 1999)
      Text en actes de congrés
      Accés obert
      Loops are the main time consuming part of programs based on floating point computations. The performance of the loops is limited either by recurrences in the computation or by the resources offered by the architecture. ...
    • Implementation of systolic algorithms using pipelined functional units 

      Valero García, Miguel; Navarro Guerrero, Juan José; Llaberia Griñó, José M.; Valero Cortés, Mateo (1990)
      Text en actes de congrés
      Accés obert
      The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of ...
    • Implementing Kilo-Instruction multiprocessors 

      Vallejo, Enrique; Galluzzi, Marco; Cristal Kestelman, Adrián; Vallejo, Fernando; Beivide Palacio, Ramon; Stenström, Per; Smith, James E.; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Text en actes de congrés
      Accés obert
      Multiprocessors are coming into wide-spread use in many application areas, yet there are a number of challenges to achieving a good tradeoff between complexity and performance. For example, while implementing memory coherence ...
    • Implementing kilo-instruction multiprocessors 

      Vallejo, Enrique; Galluzzi, Marco; Cristal Kestelman, Adrián; Vallejo, Fernando; Beivide Palacio, Ramon; Stenström, Per; Smith, James E.; Valero Cortés, Mateo (2005)
      Report de recerca
      Accés obert
      Multiprocessors are coming into wide-spread use in many application areas, yet there are a number of challenges to achieving a good tradeoff between complexity and performance. For example, while implementing memory coherence ...
    • Implicit transactional memory in chip multiprocessors 

      Galluzzi, Marco; Vallejo, Enrique; Cristal Kestelman, Adrián; Vallejo, Fernando; Beivide Palacio, Ramon; Stenström, Per; Smith, James E.; Valero Cortés, Mateo (2007-06)
      Report de recerca
      Accés obert
      Chip Multiprocessors (CMPs) are an efficient way of designing and use the huge amount of transistors on a chip. Different cores on a chip can compose a shared memory system with a very low-latency interconnect at a very ...
    • Implicit transactional memory in kilo-instruction multiprocessors 

      Galluzzi, Marco; Vallejo, Enrique; Cristal Kestelman, Adrián; Vallejo, Fernando; Beivide Palacio, Julio Ramon; Stenström, Per; Smith, James E.; Valero Cortés, Mateo (2007-06)
      Report de recerca
      Accés obert
      Although they have been the main server technology for many years, multiprocessors are undergoing a renaissance due to multi-core chips and the attractive scalability properties of combining a number of such multi-core ...
    • Implicit vs. explicit resource allocation in SMT processors 

      Cazorla Almeida, Francisco Javier; Knijnenburg, Peter M.W.; Sakellariou, Rizos; Fernandez Garcia, Enrique; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2004)
      Text en actes de congrés
      Accés obert
      In a simultaneous multithreaded (SMT) architecture, the front end of a superscalar is adapted in order to be able to fetch from several threads while the back end is shared among the threads. In this paper, we describe ...
    • Imposing coarse-grained reconfiguration to general purpose processors 

      Duric, Milovan; Stanic, Milan; Ratkovic, Ivan; Palomar Pérez, Óscar; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Valero Cortés, Mateo; Smith, Aaron (Institute of Electrical and Electronics Engineers (IEEE), 2015)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      Mobile devices execute applications with diverse compute and performance demands. This paper proposes a general purpose processor that adapts the underlying hardware to a given workload. Existing mobile processors need to ...
    • Improving accuracy and speeding up document image classification through parallel systems 

      Ferrando Monsonís, Javier; Domínguez, Juan Luis; Torres Viñals, Jordi; García Fuentes, Raul; García Doménech, David; Garrido Miñambres, Daniel; Cortada, Jordi; Valero Cortés, Mateo (Springer, 2020)
      Text en actes de congrés
      Accés obert
      This paper presents a study showing the benefits of the EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) in the Document Classification task, essential problem in the digitalization process ...
    • Improving predication efficiency through compaction/restoration of SIMD instructions 

      Barredo Ferreira, Adrián; Cebrián González, Juan Manuel; Moretó Planas, Miquel; Casas, Marc; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2020)
      Text en actes de congrés
      Accés obert
      Vector processors offer a wide range of unexplored opportunities to improve performance and energy efficiency. However, despite its potential, vector code generation and execution have significant challenges, the most ...
    • Increasing multicore system efficiency through intelligent bandwidth shifting 

      Jiménez, Víctor; Buyuktosunoglu, Alper; Bose, Pradip; O'Connell, Francis P.; Cazorla Almeida, Francisco Javier; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2015)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      Memory bandwidth is a crucial resource in computing systems. Current CMP/SMT processors have a significant number of cores and they can run many threads concurrently. This large thread count adds high pressure to the memory ...
    • Increasing the number of strides for conflict-free vector access 

      Valero Cortés, Mateo; Lang, Tomas; Llaberia Griñó, José M.; Peiron Guàrdia, Montse; Ayguadé Parra, Eduard; Navarro Guerrero, Juan José (1992-05)
      Article
      Accés obert
      Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free vector access for some strides in vector processors with multi-module memories. In this paper, we ...
    • Initial results on fuzzy floating point computation for multimedia processors 

      Álvarez Martínez, Carlos; Corbal San Adrián, Jesús; Salamí San Juan, Esther; Valero Cortés, Mateo (2002-01)
      Article
      Accés obert
      During the recent years, the market of mid/low-end portable systems such as PDAs or mobile digital phones have experimented a revolution in both selling volume and features as handheld devices incorporate Multimedia ...
    • Instruction fetch architectures and code layout optimizations 

      Ramírez Bellido, Alejandro; Larriba Pey, Josep; Valero Cortés, Mateo (2001-11)
      Article
      Accés obert
      The design of higher performance processors has been following two major trends: increasing the pipeline depth to allow faster clock rates, and widening the pipeline to allow parallel execution of more instructions. Designing ...
    • Integrating dataflow abstractions into transactional memory 

      Gajinov, Vladimir; Milovanovic, Milos; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2011)
      Text en actes de congrés
      Accés restringit per política de l'editorial
      Many concurrent programs require some form of conditional synchronization to coordinate the execution of different program tasks. Programming these algorithms using transactional memory (TM) often results in a high ...
    • Interconnection networks in petascale computer systems: A survey 

      Trobec, Roman; Vasiljevic, Radivoje; Tomasevic, Milo; Milutinovic, Veljko; Beivide Palacio, Ramon; Valero Cortés, Mateo (2016-11)
      Article
      Accés restringit per política de l'editorial
      This article provides background information about interconnection networks, an analysis of previous developments, and an overview of the state of the art. The main contribution of this article is to highlight the importance ...
    • Internet traffic and the behavior of processing workloads 

      Zilan, Ruken; Verdú Mulà, Javier; García Vidal, Jorge; Nemirovsky, Mario; Valero Cortés, Mateo (2009-06)
      Text en actes de congrés
      Accés obert
      Nowadays, the evolution of network services provided at the edge of Internet increases the requirements of network applications. Such applications result in complexities thus, the processors need to execute more complex ...
    • Introducing runahead threads 

      Ramírez García, Tanausu; Pajuelo González, Manuel Alejandro; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (2007-07)
      Report de recerca
      Accés obert
      Simultaneous Multithreading processors share their resources among multiple threads in order to improve performance. However, a resource control policy is needed to avoid resource conflicts and prevent some threads from ...
    • iQ: an efficient and flexible queue-based simulation framework 

      Roca, Damian; Nemirovsky, Daniel; Casas, Marc; Moretó Planas, Miquel; Valero Cortés, Mateo; Nemirovsky, Mario (Institute of Electrical and Electronics Engineers (IEEE), 2017)
      Text en actes de congrés
      Accés obert
      Conventional system simulators are readily used by computer architects to design and evaluate their processor designs. These simulators provide reasonable levels of accuracy and execution detail but suffer from long ...