  • ALOJA: A framework for benchmarking and predictive analytics in Hadoop deployments 

    Berral García, Josep Lluís; Poggi Mastrokalo, Nicolas; Carrera Pérez, David; Call, Aaron; Reinauer, Rob; Green, Daron (Institute of Electrical and Electronics Engineers (IEEE), 2015-10)
    Article
    Open Access
    This article presents the ALOJA project and its analytics tools, which leverage machine learning to interpret Big Data benchmark performance data and tuning. ALOJA is part of a long-term collaboration between BSC and ...
  • ALOJA: a systematic study of Hadoop deployment variables to enable automated characterization of cost-effectiveness 

    Poggi Mastrokalo, Nicolas; Carrera Pérez, David; Call, Aaron; Mendoza, Sergio; Becerra Fontal, Yolanda; Torres Viñals, Jordi; Ayguadé Parra, Eduard; Gagliardi, Fabrizio; Labarta Mancho, Jesús José; Reinauer, Rob; Vujic, Nikola; Green, Daron; Blakeley, Jose (Institute of Electrical and Electronics Engineers (IEEE), 2014)
    Conference report
    Open Access
    This article presents the ALOJA project, an initiative to produce mechanisms for an automated characterization of cost-effectiveness of Hadoop deployments and reports its initial results. ALOJA is the latest phase of a ...
  • ALOJA-ML: a framework for automating characterization and knowledge discovery in Hadoop deployments 

    Berral García, Josep Lluís; Poggi, Nicolas; Carrera Pérez, David; Call, Aaron; Reinauer, Rob; Green, Daron (Association for Computing Machinery (ACM), 2015)
    Conference report
    Open Access
    This article presents ALOJA-Machine Learning (ALOJA-ML), an extension to the ALOJA project that uses machine learning techniques to interpret Hadoop benchmark performance data and performance tuning; here we detail the ...
  • Disaggregating Non-Volatile Memory for Throughput-Oriented Genomics Workloads 

    Call, Aaron; Polo, Jordà; Carrera, David; Guim, Francesc; Sen, Sujoy (Springer, 2018-12)
    Conference lecture
    Open Access
    Massive exploitation of next-generation sequencing technologies requires dealing with both huge amounts of data and complex bioinformatics pipelines. Computing architectures have evolved to deal with these problems, ...