Now showing items 21-40 of 260

  • Analysis of the overheads incurred due to speculation in a task based programming model 

    Gayatri, Rahulkumar; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard (2015)
    Conference report
    Open Access
    In order to efficiently utilize the ever-increasing processing power of multi-cores, a programmer must extract as much parallelism as possible from a given application. However, with every such attempt there is an associated ...
  • Analyzing performance improvements and energy savings in Infiniband architecture using network compression 

    Dickov, Branimir; Pericas, Miquel; Carpenter, Paul; Navarro, Nacho; Ayguadé Parra, Eduard (Institute of Electrical and Electronics Engineers (IEEE), 2014)
    Conference report
    Restricted access - publisher's policy
    One of the greatest challenges in HPC is total system power and energy consumption. Whereas HPC interconnects have traditionally been designed with a focus on bandwidth and latency, there is an increasing interest in ...
  • An approach to task-based parallel programming for undergraduate students 

    Ayguadé Parra, Eduard; Jiménez González, Daniel (2018-03-07)
    Article
    Open Access
    This paper presents the description of a compulsory parallel programming course in the bachelor degree in Informatics Engineering at the Barcelona School of Informatics, Universitat Politècnica de Catalunya UPC-BarcelonaTech. ...
  • Another trip to the wall: how much will stacked DRAM benefit HPC? 

    Radulovic, Milan; Zivanovic, Darko; Ruiz, Daniel; De Supinski, Bronis; McKee, Sally; Radojkovic, Petar; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2015)
    Conference report
    Restricted access - publisher's policy
    First defined two decades ago, the memory wall remains a fundamental limitation to system performance. Recent innovations in 3D-stacking technology enable DRAM devices with much higher bandwidths than traditional DIMMs. ...
  • An out-of-the-box full-network embedding for convolutional neural networks 

    Garcia-Gasulla, Dario; Vilalta, Armand; Parés, Ferran; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Cortés García, Claudio Ulises; Suzumura, Toyotaro (Institute of Electrical and Electronics Engineers (IEEE), 2018)
    Conference report
    Open Access
    Features extracted through transfer learning can be used to exploit deep learning representations in contexts where there are very few training samples, where there are limited computational resources, or when the tuning ...
  • A novel asynchronous software cache implementation for the Cell-BE processor 

    Balart, J; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Sura, Z; Chen, T; Zhang, T; O'Brien, Kevin; O'Brien, Kathryn (2008-10)
    Article
    Restricted access - publisher's policy
    This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides several services that allow the compiler to generate ...
  • Application Acceleration on FPGAs with OmpSs@FPGA 

    Bosch, Jaume; Tan, Xubin; Filgueras Izquierdo, Antonio; Vidal, Miquel; Mateu, Marc; Jiménez-González, Daniel; Álvarez, Carlos; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2019)
    Conference report
    Open Access
    OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous ...
  • A proposal for error handling in OpenMP 

    Duran González, Alejandro; Ferrer, Roger; Costa Prats, Juan José; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José (2006-06)
    Article
    Restricted access - publisher's policy
    OpenMP has focused on performance for numerical applications, but when we try to move this focus to other kinds of applications, like Web servers, we detect one important gap. In these applications, performance ...
  • A proposal for task parallelism in OpenMP 

    Ayguadé Parra, Eduard; Copty, Nawal; Duran González, Alejandro; Hoeflinger, Jay; Lin, Yuan; Massaioli, Federico; Su, Ernesto; Unnikrishnan, Priya; Zhang, Guansong (2007-06)
    Article
    Restricted access - publisher's policy
    This paper presents a novel proposal to define task parallelism in OpenMP. Task parallelism has been lacking in the OpenMP language for a number of years already. As we show, this makes certain kinds of applications difficult ...
  • A proposal to extend the OpenMP tasking model with dependent tasks 

    Duran González, Alejandro; Ferrer, Roger; Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Labarta Mancho, Jesús José (2009-06)
    Article
    Restricted access - publisher's policy
    Tasking in OpenMP 3.0 has been conceived to handle the dynamic generation of unstructured parallelism. New directives have been added allowing the user to identify units of independent work (tasks) and to define points ...
  • Assembling a high-productivity DSL for computational fluid dynamics 

    Macià, Sandra; Martínez-Ferrer, Pedro J.; Mateo, Sergi; Beltran Querol, Vicenç; Ayguadé Parra, Eduard (Association for Computing Machinery (ACM), 2019)
    Conference report
    Open Access
    As we move towards exascale computing, an abstraction for effective parallel computation is increasingly needed to overcome the maintainability and portability of scientific applications while ensuring the efficient and ...
  • A streaming machine description and programming model 

    Carpenter, Paul; Ródenas Picó, David; Martorell Bofill, Xavier; Ramírez Bellido, Alejandro; Ayguadé Parra, Eduard (2007-07)
    Article
    Restricted access - publisher's policy
    In this paper we present the initial development of a streaming environment based on a programming model and machine description. The stream programming model consists of an extension to the C language and its translation ...
  • A Survey on Performance Management for Internet Applications 

    Guitart Fernández, Jordi; Torres Viñals, Jordi; Ayguadé Parra, Eduard (2010-01-01)
    Article
    Restricted access - publisher's policy
    Internet applications have become indispensable for many business and personal processes, turning the performance of these applications into a key issue. For this reason, recent research has comprehensively explored ...
  • Asynchronous and exact forward recovery for detected errors in iterative solvers 

    Jaulmes, Luc Etienne; Casas, Marc; Moreto Planas, Miquel; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2018-03-19)
    Article
    Open Access
    Current trends and projections show that faults in computer systems are becoming increasingly common. Such errors may be detected, and possibly corrected transparently, e.g. by Error Correcting Codes (ECC). For a program to be ...
  • A template system for the efficient compilation of domain abstractions onto reconfigurable computers 

    Shafiq, Muhammad; Pericàs Gleim, Miquel; Ayguadé Parra, Eduard (2011)
    Conference report
    Restricted access - publisher's policy
    Past research has addressed the issue of using FPGAs as accelerators for HPC systems. However, writing low-level code for an efficient, portable and scalable architecture has always been a ...
  • Atomic quake: using transactional memory in an interactive multiplayer game server

    Zyulkyarov, Ferad; Gajinov, Vladimir; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Ayguadé Parra, Eduard; Harris, Tim; Valero Cortés, Mateo (2009)
    Conference report
    Open Access
    Transactional Memory (TM) is being studied widely as a new technique for synchronizing concurrent accesses to shared memory data structures for use in multi-core systems. Much of the initial work on TM has been evaluated ...
  • A transparent runtime data distribution engine for OpenMP 

    Nikolopoulos, Dimitrios; Papatheodorou, Theodore; Polychronopoulos, C D; Labarta Mancho, Jesús José; Ayguadé Parra, Eduard (2001-07)
    Article
    Restricted access - publisher's policy
    This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the ...
  • Automated curation of brand-related social media images with deep learning 

    Tous Liesa, Rubén; Gómez Parada, Mauro; Poveda, Jonatan; Cruz, Leonel; Wust, Otto; Makni, Mouna; Ayguadé Parra, Eduard (2018-10)
    Article
    Open Access
    This paper presents work that uses deep convolutional neural networks (CNNs) to facilitate the curation of brand-related social media images. The final goal is to facilitate searching and discovering user-generated ...
  • Automatic exploration of potential parallelism in sequential applications 

    Subotic, Vladimir; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (Springer, 2014)
    Conference report
    Restricted access - publisher's policy
    The multicore era has increased the need for highly parallel software. Since automatic parallelization turned out to be ineffective for many production codes, the community hopes for the development of tools that may assist ...
  • Automatic multilevel parallelization using OpenMP 

    Jin, H; Jost, G; Yan, J; Ayguadé Parra, Eduard; González Tallada, Marc; Martorell Bofill, Xavier (2004-06)
    Article
    Restricted access - publisher's policy
    In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler ...