Now showing items 1-20 of 97

    • A Proposal of modification in Thorup Zwick Routing algorithm for future Internet 

      Bathla, Yatish (Universitat Politècnica de Catalunya, 2012-07-23)
      Official Master's Final Project
      Open access
      [ENGLISH] The Thorup and Zwick routing scheme (usually referred to as the TZ scheme) is a very well-known compact routing scheme which aims at reducing the scalability problem of routing in the Internet. In this work, we propose a ...
    • A sampling-based approach for automatic generation of microbenchmarks with a representative memory state 

      Bigas Soldevila, Arnau (Universitat Politècnica de Catalunya, 2021-06-28)
      Bachelor's Final Project
      Open access
      As processors have become more complex, and so has the technology with which they are manufactured, the simulation time of the physical processor has increased considerably. To reduce the time of ...
    • A software-hardware hybrid steering mechanism for clustered microarchitectures 

      Cai, Qiong; Codina Viñas, Josep M.; González González, José; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Conference paper
      Open access
      Clustered microarchitectures provide a promising paradigm to solve or alleviate the problems of increasing microprocessor complexity and wire delays. High-performance out-of-order processors rely on hardware-only steering ...
    • A symbolic emulator for shuffle synthesis on the NVIDIA PTX code 

      Matsumura, Kazuaki; García de Gonzalo, Simón; Peña Monferrer, Antonio José (Association for Computing Machinery (ACM), 2023)
      Conference paper
      Open access
      Various kinds of applications take advantage of GPUs through automation tools that attempt to automatically exploit the available performance of the GPU's parallel architecture. Directive-based programming models, such as ...
    • A tool for automatic evaluation of human translation quality within a mooc environment 

      Betanzos Atienza, Miguel (Universitat Politècnica de Catalunya, 2015-10)
      Official Master's Final Project
      Open access
      Description of the process of creating a corpus of translations through a course offered on the openEdX platform, and its subsequent analysis in order to train an evaluation model for similar translations that ...
    • A toolchain to verify the parallelization of OmpSs-2 applications 

      Economo, Simone; Royuela Alcázar, Sara; Ayguadé Parra, Eduard; Beltran Querol, Vicenç (Springer, 2020)
      Conference paper
      Open access
      Programming models for task-based parallelization based on compile-time directives are very effective at uncovering the parallelism available in HPC applications. Despite that, the process of correctly annotating complex ...
    • A unified modulo scheduling and register allocation technique for clustered processors 

      Codina Viñas, Josep M.; Sánchez Navarro, F. Jesús; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2001)
      Conference paper
      Open access
      This work presents a modulo scheduling framework for clustered ILP processors that integrates the cluster assignment, instruction scheduling and register allocation steps in a single phase. This unified approach is more ...
    • ACOTES project: Advanced compiler technologies for embedded streaming 

      Duranton, M.; Munk, H.; Ayguadé Parra, Eduard; Bastoul, C.; Carpenter, Paul Matthew; Chamski, Z.; Cohen, A.; Cornero, M.; Dumont, P.; Pop, S.; Pop, A.; Ornstein, A.; Nuzman, D.; Miranda, C.; Martorell Bofill, Xavier; Lindwer, M.; Ladelsky, R.; Ferrer, Roger; Fellahi, M.; Pouchet, L. N; Zaks, A.; Shvadron, U.; Trifunovic, K.; Rohou, E.; Rosen, I.; Ramírez Bellido, Alejandro; Ródenas, D. (2011-04)
      Article
      Open access
      Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded ...
    • Advanced synchronization techniques for task-based runtime systems 

      Álvarez Robert, David; Sala Penadés, Kevin; Maroñas Bravo, Marcos; Roca Nonell, Aleix; Beltran Querol, Vicenç (Association for Computing Machinery (ACM), 2021)
      Conference paper
      Open access
      Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with small ...
    • Align and distribute-based linear loop transformations 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Springer, 1993)
      Conference paper
      Open access
      In this paper we generalize the framework of linear loop transformations in the sense that loop alignment is considered as a new component in the transformation process. The aim is to match the structure of loop nests with ...
    • Analyzing data locality in numeric applications 

      Sánchez Navarro, F. Jesús; González Colás, Antonio María (2000-07)
      Article
      Open access
      In this article, we introduce SPLAT (Static and Profiled Data Locality Analysis Tool). The tool's purpose is to provide a fast study of memory behavior without the necessity of a costly memory simulator. SPLAT consists of ...
    • Applying interposition techniques for performance analysis of OPENMP parallel applications 

      González Tallada, Marc; Serra, Albert; Martorell Bofill, Xavier; Oliver Segura, José; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Navarro, Nacho (Institute of Electrical and Electronics Engineers (IEEE), 2000)
      Conference paper
      Open access
      Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performance degradation may arise. We believe that ...
    • Assisting static compiler vectorization with a speculative dynamic vectorizer in an HW/SW codesigned environment 

      Kumar, Rakesh; Martínez, Alejandro; González Colás, Antonio María (2016-01-01)
      Article
      Open access
      Compiler-based static vectorization is used widely to extract data-level parallelism from computation-intensive applications. Static vectorization is very effective in vectorizing traditional array-based applications. ...
    • Author retrospective for "Software trace cache" 

      Ramírez Bellido, Alejandro; Falcón Samper, Ayose Jesus; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2014)
      Conference paper
      Open access
      In superscalar processors, capable of issuing and executing multiple instructions per cycle, fetch performance represents an upper bound to the overall processor performance. Unless there is some form of instruction re-use ...
    • Automatic evaluation of top-down predictive parsing 

      Creus, Carles; Fernández Durán, Pau; Godoy, Guillem; Mamano, Nil (2016-04-01)
      Research report
      Open access
      We develop efficient methods to check whether two given Context-Free Grammars (CFGs) are transformed into parsers that recognize the same language and construct the same Abstract Syntax Trees (ASTs) for each input. In this ...
    • Automatic safe data reuse detection for the WCET analysis of systems with data caches 

      Segarra Flor, Juan; Cortadella, Jordi; Gran Tejero, Rubén; Viñals Yúfera, Victor (Institute of Electrical and Electronics Engineers (IEEE), 2020-10-19)
      Article
      Open access
      Worst-case execution time (WCET) analysis of systems with data caches is one of the key challenges in real-time systems. Caches exploit the inherent reuse properties of programs, temporarily storing certain memory contents ...
    • Automatic translation of programs for evaluation of execution times 

      Martín Brualla, Ricardo (Universitat Politècnica de Catalunya, 2011-12-23)
      Final Degree Project
      Open access
      [SPANISH] This project pursues the automatic translation of programs written in a subset of C++ into other programming languages, in order to better estimate time limits on online judges.
    • Benchmarking of state-of-the-art HPC clusters with a production CFD code 

      Banchelli Gracia, Fabio; Garcia Gasulla, Marta; Houzeaux, Guillaume; Mantovani, Filippo (Association for Computing Machinery (ACM), 2020)
      Conference paper
      Open access
      Computing technologies populating high-performance computing (HPC) clusters are getting more and more diverse, offering a wide range of architectural features. As a consequence, efficient programming of such platforms ...
    • Binary Redundancy Elimination 

      Fernández Gómez, Manuel (Universitat Politècnica de Catalunya, 2005-04-13)
      Thesis
      Open access
      Two of the most important performance limitations in today's processors come from memory operations and control dependences. To solve these problems, cache memories and ...
    • CellMT: A cooperative multithreading library for the Cell/B.E. 

      Beltran Querol, Vicenç; Carrera Pérez, David; Torres Viñals, Jordi; Ayguadé Parra, Eduard (IEEE Computer Society Publications, 2009-12-16)
      Conference paper
      Open access
      The Cell BE processor has proved that heterogeneous multi-core systems can provide a huge computational power with high efficiency for a wide range of applications. The simple design of the computational units and the use ...