    • BLAS-3 optimized by OmpSs regions (LASs library) 

      Valero Lara, Pedro; Catalán Pallarés, Sandra; Martorell Bofill, Xavier; Labarta Mancho, Jesús José (Institute of Electrical and Electronics Engineers (IEEE), 2019)
      Conference report
      Open Access
      In this paper we propose a set of optimizations for the BLAS-3 routines of LASs library (Linear Algebra routines on OmpSs) and perform a detailed analysis of the impact of the proposed changes in terms of performance and ...
    • cuThomasBatch and cuThomasVBatch, CUDA Routines to compute batch of tridiagonal systems on NVIDIA GPUs 

      Valero Lara, Pedro; Martinez Pérez, Ivan; Sirvent, Raül; Martorell Bofill, Xavier; Peña, Antonio J. (Wiley, 2018-01-01)
      Article
      Open Access
      The solving of tridiagonal systems is one of the most computationally expensive parts in many applications, so that multiple studies have explored the use of NVIDIA GPUs to accelerate such computation. However, these studies ...
    • sLASs: a fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library) 

      Valero Lara, Pedro; Catalán Pallarés, Sandra; Martorell Bofill, Xavier; Usui, Tetsuzo; Labarta Mancho, Jesús José (Elsevier, 2020-04-01)
      Article
      Restricted access - publisher's policy
      In this work we have implemented a novel Linear Algebra Library on top of the task-based runtime OmpSs-2. We have used some of the most advanced OmpSs-2 features: weak dependencies and regions, together with the final ...
    • Towards an auto-tuned and task-based SpMV (LASs Library) 

      Catalán Pallarés, Sandra; Usui, Tetsuzo; Toledo, Leonel; Martorell Bofill, Xavier; Labarta Mancho, Jesús José; Valero Lara, Pedro (Springer, 2020)
      Conference report
      Open Access
      We present a novel approach to parallelize the SpMV kernel included in LASs (Linear Algebra routines on OmpSs) library, after a deep review and analysis of several well-known approaches. LASs is based on OmpSs, a task-based ...