Recent Submissions

  • MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain 

    Valero-Lara, Pedro; Sirvent, Raül; Peña, Antonio J.; Labarta Mancho, Jesús José (2019-05)
    Article
    Restricted access - publisher's policy
    The simulation of the behavior of the human brain is one of the most ambitious challenges today, with an endless range of important applications. We can find many different initiatives in the USA, Europe and Japan which attempt ...
  • A hardware runtime for task-based programming models 

    Tan, Xubin; Bosch, Jaume; Álvarez, Carlos; Jiménez González, Daniel; Ayguadé Parra, Eduard; Valero Cortés, Mateo (2019-09-01)
    Article
    Open Access
    Task-based programming models such as OpenMP 5.0 and OmpSs are simple to use and powerful enough to exploit task parallelism of applications over multicore, manycore and heterogeneous systems. However, their software-only ...
  • Distributed training of deep neural networks with spark: The MareNostrum experience 

    Cruz, Leonel; Tous Liesa, Rubén; Otero Calviño, Beatriz (Elsevier, 2019-07-01)
    Article
    Restricted access - publisher's policy
    Deployment of a distributed deep learning technology stack on a large parallel system is a very complex process, involving the integration and configuration of several layers of both general-purpose and custom software. ...
  • Using Arm’s scalable vector extension on stencil codes 

    Armejach Sanosa, Adrià; Caminal Pallarés, Helena; Cebrián González, Juan Manuel; Langarita, Rubén; González-Alberquilla, Rekai; Adeniyi-Jones, Chris; Valero Cortés, Mateo; Casas Guix, Marc; Moreto Planas, Miquel (2019-04-08)
    Article
    Restricted access - publisher's policy
    Data-level parallelism is frequently ignored or underutilized. Achieved through vector/SIMD capabilities, it can provide substantial performance improvements on top of widely used techniques such as thread-level parallelism. ...
  • Task Packing: Efficient task scheduling in unbalanced parallel programs to maximize CPU utilization 

    Utrera Iglesias, Gladys Miriam; Farreras Esclusa, Montse; Fornés de Juan, Jordi (Elsevier, 2019-12)
    Article
    Restricted access - publisher's policy
    Load imbalance in parallel systems can be generated by factors external to the currently running applications, such as operating system noise, or by the underlying hardware, such as a heterogeneous cluster. HPC applications working ...
  • Artificial neural networks as emerging tools for earthquake detection 

    Rojas, Otilio; Otero Calviño, Beatriz; Alvarado, Leonardo; Mus, Sergi; Tous Liesa, Rubén (2019)
    Article
    Open Access
    As seismic networks continue to spread and monitoring sensors become more efficient, the abundance of data highly surpasses the processing capabilities of earthquake interpretation analysts. Earthquake catalogs are fundamental ...
  • PROFET: modeling system performance and energy without simulating the CPU 

    Radulovic, Milan; Sánchez-Verdejo, Rommel; Carpenter, Paul Matthew; Radojkovic, Petar; Jacob, Bruce; Ayguadé Parra, Eduard (2019-06)
    Article
    Open Access
    The approaching end of DRAM scaling and expansion of emerging memory technologies is motivating a lot of research in future memory systems. Novel memory systems are typically explored by hardware simulators that are slow ...
  • Increasing the number of strides for conflict-free vector access 

    Valero Cortés, Mateo; Lang, Tomas; Llaberia Griñó, José M.; Peiron Guàrdia, Montse; Ayguadé Parra, Eduard; Navarro Guerrero, Juan José (1992-05)
    Article
    Open Access
    Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free vector access for some strides in vector processors with multi-module memories. In this paper, we ...
  • Conflict-free strides for vectors in matched memories 

    Valero Cortés, Mateo; Lang, Tomas; Llaberia Griñó, José M.; Peiron Guàrdia, Montse; Navarro Guerrero, Juan José; Ayguadé Parra, Eduard (1991-12)
    Article
    Open Access
    Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free access to one family of strides in vector processors with matched memories. The paper extends these ...
  • Studying the impact of the Full-Network embedding on multimodal pipelines 

    Vilalta, Armand; Garcia-Gasulla, Dario; Pares, Ferran; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Moya-Sánchez, Ulises; Cortés García, Claudio Ulises (IOS Press, 2018)
    Article
    Open Access
    The current state of the art for image annotation and image retrieval tasks is obtained through deep neural network multimodal pipelines, which combine an image representation and a text representation into a shared embedding ...
  • Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD 

    Rodríguez Sánchez, Rafael; Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Tomás Domínguez, Andrés Enrique (2019-02-01)
    Article
    Open Access
    We address the reduction to compact band forms, via unitary similarity transformations, for the solution of symmetric eigenvalue problems and the computation of the singular value decomposition (SVD). Concretely, in the ...
  • Two-sided orthogonal reductions to condensed forms on asymmetric multicore processors 

    Alonso Jordá, Pedro; Catalán Pallarés, Sandra; Herrero Zaragoza, José Ramón; Quintana Ortí, Enrique Salvador; Rodríguez Sánchez, Rafael (2018-10)
    Article
    Restricted access - publisher's policy
    We investigate how to leverage the heterogeneous resources of an Asymmetric Multicore Processor (AMP) in order to deliver high performance in the reduction to condensed forms for the solution of dense eigenvalue and ...
