    • A sampling-based approach for automatic generation of microbenchmarks with a representative memory state 

      Bigas Soldevila, Arnau (Universitat Politècnica de Catalunya, 2021-06-28)
      Bachelor thesis
      Open Access
      As processors have become more complex, and so has the technology in which they are manufactured, the simulation time of the physical processor has increased considerably. To reduce the time of ...
    • A software-hardware hybrid steering mechanism for clustered microarchitectures 

      Cai, Qiong; Codina Viñas, Josep M.; González González, José; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2008)
      Conference report
      Open Access
      Clustered microarchitectures provide a promising paradigm to solve or alleviate the problems of increasing microprocessor complexity and wire delays. High-performance out-of-order processors rely on hardware-only steering ...
    • A tool for automatic evaluation of human translation quality within a mooc environment 

      Betanzos Atienza, Miguel (Universitat Politècnica de Catalunya, 2015-10)
      Master thesis
      Open Access
      Description of the process of creating a corpus of translations through a course offered on the openEdX platform, and its subsequent analysis in order to train an evaluation model for similar translations that ...
    • A toolchain to verify the parallelization of OmpSs-2 applications 

      Economo, Simone; Royuela Alcázar, Sara; Ayguadé Parra, Eduard; Beltran Querol, Vicenç (Springer, 2020)
      Conference report
      Open Access
      Programming models for task-based parallelization based on compile-time directives are very effective at uncovering the parallelism available in HPC applications. Despite that, the process of correctly annotating complex ...
    • A unified modulo scheduling and register allocation technique for clustered processors 

      Codina Viñas, Josep M.; Sánchez Navarro, F. Jesús; González Colás, Antonio María (Institute of Electrical and Electronics Engineers (IEEE), 2001)
      Conference report
      Open Access
      This work presents a modulo scheduling framework for clustered ILP processors that integrates the cluster assignment, instruction scheduling and register allocation steps in a single phase. This unified approach is more ...
    • ACOTES project: Advanced compiler technologies for embedded streaming 

      Duranton, M.; Munk, H.; Ayguadé Parra, Eduard; Bastoul, C.; Carpenter, Paul; Chamski, Z.; Cohen, A.; Cornero, M.; Dumont, P.; Pop, S.; Pop, A.; Ornstein, A.; Nuzman, D.; Miranda, C.; Martorell Bofill, Xavier; Lindwer, M.; Ladelsky, R.; Ferrer, Roger; Fellahi, M.; Pouchet, L. N.; Zaks, A.; Shvadron, U.; Trifunovic, K.; Rohou, E.; Rosen, I.; Ramírez Bellido, Alejandro; Ródenas, D. (2011-04)
      Article
      Open Access
      Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded ...
    • Align and distribute-based linear loop transformations 

      Torres Viñals, Jordi; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Springer, 1993)
      Conference report
      Open Access
      In this paper we generalize the framework of linear loop transformations in the sense that loop alignment is considered as a new component in the transformation process. The aim is to match the structure of loop nests with ...
    • Analyzing data locality in numeric applications 

      Sánchez Navarro, F. Jesús; González Colás, Antonio María (2000-07)
      Article
      Open Access
      In this article, we introduce SPLAT (Static and Profiled Data Locality Analysis Tool). The tool's purpose is to provide a fast study of memory behavior without the necessity of a costly memory simulator. SPLAT consists of ...
    • Applying interposition techniques for performance analysis of OPENMP parallel applications 

      González Tallada, Marc; Serra, Albert; Martorell Bofill, Xavier; Oliver Segura, José; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Navarro, Nacho (Institute of Electrical and Electronics Engineers (IEEE), 2000)
      Conference report
      Open Access
      Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. During the execution of a parallel program, many individual situations of performance degradation may arise. We believe that ...
    • Assisting static compiler vectorization with a speculative dynamic vectorizer in an HW/SW codesigned environment 

      Kumar, Rakesh; Martínez, Alejandro; González Colás, Antonio María (2016-01-01)
      Article
      Open Access
      Compiler-based static vectorization is used widely to extract data-level parallelism from computation-intensive applications. Static vectorization is very effective in vectorizing traditional array-based applications. ...
    • Author retrospective for "Software trace cache" 

      Ramírez Bellido, Alejandro; Falcón Samper, Ayose Jesus; Santana Jaria, Oliverio J.; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2014)
      Conference report
      Open Access
      In superscalar processors, capable of issuing and executing multiple instructions per cycle, fetch performance represents an upper bound to the overall processor performance. Unless there is some form of instruction re-use ...
    • Automatic evaluation of top-down predictive parsing 

      Creus, Carles; Fernández Durán, Pau; Godoy, Guillem; Mamano, Nil (2016-04-01)
      External research report
      Open Access
      We develop efficient methods to check whether two given Context-Free Grammars (CFGs) are transformed into parsers that recognize the same language and construct the same Abstract Syntax Trees (ASTs) for each input. In this ...
    • Automatic safe data reuse detection for the WCET analysis of systems with data caches 

      Segarra Flor, Juan; Cortadella, Jordi; Gran Tejero, Rubén; Viñals Yúfera, Victor (Institute of Electrical and Electronics Engineers (IEEE), 2020-10-19)
      Article
      Open Access
      Worst-case execution time (WCET) analysis of systems with data caches is one of the key challenges in real-time systems. Caches exploit the inherent reuse properties of programs, temporarily storing certain memory contents ...
    • Automatic translation of programs for evaluation of execution times 

      Martín Brualla, Ricardo (Universitat Politècnica de Catalunya, 2011-12-23)
      Master thesis (pre-Bologna period)
      Open Access
      This project pursues the automatic translation of programs written in a subset of C++ into other programming languages, in order to better estimate time limits on online judges.
    • Benchmarking of state-of-the-art HPC clusters with a production CFD code 

      Banchelli Gracia, Fabio; Garcia Gasulla, Marta; Houzeaux, Guillaume; Mantovani, Filippo (Association for Computing Machinery (ACM), 2020)
      Conference report
      Open Access
      Computing technologies populating high-performance computing (HPC) clusters are getting more and more diverse, offering a wide range of architectural features. As a consequence, efficient programming of such platforms ...
    • CellMT: A cooperative multithreading library for the Cell/B.E. 

      Beltran Querol, Vicenç; Carrera Pérez, David; Torres Viñals, Jordi; Ayguadé Parra, Eduard (IEEE Computer Society Publications, 2009-12-16)
      Conference report
      Open Access
      The Cell BE processor has proved that heterogeneous multi-core systems can provide a huge computational power with high efficiency for a wide range of applications. The simple design of the computational units and the use ...
    • Code layout optimizations for transaction processing workloads 

      Ramírez Bellido, Alejandro; Barroso, Luiz A.; Gharachorloo, Kourosh; Cohn, Robert; Larriba Pey, Josep; Lowney, P. Geoffrey; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2001)
      Conference report
      Open Access
      Commercial applications such as databases and Web servers constitute the most important market segment for high-performance servers. Among these applications, on-line transaction processing (OLTP) workloads provide a ...
    • Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures 

      Álvarez Martí, Lluc; Vilanova, Lluís; Moreto Planas, Miquel; Casas, Marc; González Tallada, Marc; Martorell Bofill, Xavier; Navarro, Nacho; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2015)
      Conference report
      Open Access
      The increasing number of cores in manycore architectures causes important power and scalability problems in the memory subsystem. One solution is to introduce scratchpad memories alongside the cache hierarchy, forming a ...
    • Compilación C a VHDL de códigos de bucles con reuso de datos 

      Sánchez Fernández, Raúl (Universitat Politècnica de Catalunya, 2010-03-25)
      Master thesis (pre-Bologna period)
      Open Access
      During this project, a source-to-source compiler named CtoVHDL was developed, capable of translating C loops into VHDL. This translation produces a hardware accelerator able to execute the loop on an FPGA. ...
    • Compiler Analysis and its application to OmpSs 

      Royuela Alcázar, Sara (Universitat Politècnica de Catalunya, 2012-01-10)
      Master thesis
      Open Access
      Nowadays, productivity is the buzzword in any computer science area. Several metrics have been defined in order to measure the productivity in any type of system. Some of the most important are the performance, the ...