Recent Submissions

  • Distributed training of deep neural networks with spark: The MareNostrum experience 

    Cruz, Leonel; Tous Liesa, Rubén; Otero Calviño, Beatriz (Elsevier, 2019-07-01)
    Article
    Restricted access - publisher's policy
    Deployment of a distributed deep learning technology stack on a large parallel system is a very complex process, involving the integration and configuration of several layers of both general-purpose and custom software. ...
  • Using Arm’s scalable vector extension on stencil codes 

    Armejach Sanosa, Adrià; Caminal Pallarés, Helena; Cebrián González, Juan Manuel; Langarita, Rubén; González-Alberquilla, Rekai; Adeniyi-Jones, Chris; Valero Cortés, Mateo; Casas Guix, Marc; Moreto Planas, Miquel (2019-04-08)
    Article
    Restricted access - publisher's policy
    Data-level parallelism is frequently ignored or underutilized. Achieved through vector/SIMD capabilities, it can provide substantial performance improvements on top of widely used techniques such as thread-level parallelism. ...
  • Task Packing: Efficient task scheduling in unbalanced parallel programs to maximize CPU utilization 

    Utrera Iglesias, Gladys Miriam; Farreras Esclusa, Montse; Fornés de Juan, Jordi (Elsevier, 2019-12)
    Article
    Restricted access - publisher's policy
    Load imbalance in parallel systems can be generated by external factors to the currently running applications like operating system noise or the underlying hardware like a heterogeneous cluster. HPC applications working ...
  • Artificial neural networks as emerging tools for earthquake detection 

    Rojas, Otilio; Otero Calviño, Beatriz; Alvarado, Leonardo; Mus, Sergi; Tous Liesa, Rubén (2019)
    Article
    Open Access
    As seismic networks continue to spread and monitoring sensors become more efficient, the abundance of data highly surpasses the processing capabilities of earthquake interpretation analysts. Earthquake catalogs are fundamental ...
  • PROFET: modeling system performance and energy without simulating the CPU 

    Radulovic, Milan; Sánchez-Verdejo, Rommel; Carpenter, Paul Matthew; Radojkovic, Petar; Jacob, Bruce; Ayguadé Parra, Eduard (2019-06)
    Article
    Open Access
    The approaching end of DRAM scaling and expansion of emerging memory technologies is motivating a lot of research in future memory systems. Novel memory systems are typically explored by hardware simulators that are slow ...
  • Increasing the number of strides for conflict-free vector access 

    Valero Cortés, Mateo; Lang, Tomas; Llaberia Griñó, José M.; Peiron Guàrdia, Montse; Ayguadé Parra, Eduard; Navarro Guerrero, Juan José (1992-05)
    Article
    Open Access
    Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free vector access for some strides in vector processors with multi-module memories. In this paper, we ...
  • Conflict-free strides for vectors in matched memories 

    Valero Cortés, Mateo; Lang, Tomas; Llaberia Griñó, José M.; Peiron Guàrdia, Montse; Navarro Guerrero, Juan José; Ayguadé Parra, Eduard (1991-12)
    Article
    Open Access
    Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free access to one family of strides in vector processors with matched memories. The paper extends these ...
  • Studying the impact of the Full-Network embedding on multimodal pipelines 

    Vilalta, Armand; Garcia-Gasulla, Dario; Pares, Ferran; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Moya-Sánchez, Ulises; Cortés García, Claudio Ulises (IOS Press, 2018)
    Article
    Open Access
    The current state of the art for image annotation and image retrieval tasks is obtained through deep neural network multimodal pipelines, which combine an image representation and a text representation into a shared embedding ...
  • Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD 

    Rodríguez Sánchez, Rafael; Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Tomás Domínguez, Andrés Enrique (2019-02-01)
    Article
    Open Access
    We address the reduction to compact band forms, via unitary similarity transformations, for the solution of symmetric eigenvalue problems and the computation of the singular value decomposition (SVD). Concretely, in the ...
  • Two-sided orthogonal reductions to condensed forms on asymmetric multicore processors 

    Alonso Jordá, Pedro; Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Rodríguez Sánchez, Rafael (2018-10)
    Article
    Restricted access - publisher's policy
    We investigate how to leverage the heterogeneous resources of an Asymmetric Multicore Processor (AMP) in order to deliver high performance in the reduction to condensed forms for the solution of dense eigenvalue and ...
  • Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors 

    Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Rodríguez Sánchez, Rafael (2018-08)
    Article
    Restricted access - publisher's policy
    We analyze the benefits of look-ahead in the parallel execution of the LU factorization with partial pivoting (LUpp) in two distinct “asymmetric” multicore scenarios. The first one corresponds to an actual hardware-asymmetric ...
  • On the maturity of parallel applications for asymmetric multi-core processors 

    Chronaki, Kallia; Moreto Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Elsevier, 2019-05-01)
    Article
    Restricted access - publisher's policy
    Asymmetric multi-cores (AMCs) are a successful architectural solution for both mobile devices and supercomputers. By maintaining two types of cores (fast and slow) AMCs are able to provide high performance under the facility ...
