  • ALOJA: A benchmarking and predictive platform for big data performance analysis 

    Poggi, Nicolas; Berral García, Josep Lluís; Carrera Pérez, David (Springer, 2016)
    Conference report
    Open Access
    The main goals of the ALOJA research project from BSC-MSR are to explore and automate the characterization of the cost-effectiveness of Big Data deployments. The development of the project over its first year has resulted in ...
  • ALOJA-ML: a framework for automating characterization and knowledge discovery in Hadoop deployments 

    Berral García, Josep Lluís; Poggi, Nicolas; Carrera Pérez, David; Call, Aaron; Reinauer, Rob; Green, Daron (Association for Computing Machinery (ACM), 2015)
    Conference report
    Open Access
    This article presents ALOJA-Machine Learning (ALOJA-ML), an extension to the ALOJA project that uses machine learning techniques to interpret Hadoop benchmark performance data and performance tuning; here we detail the ...
  • Characterizing BigBench Queries, Hive, and Spark in Multi-cloud Environments 

    Poggi, Nicolas; Montero, Alejandro; Carrera, David (Springer Verlag, 2017-12-30)
    Conference lecture
    Open Access
    BigBench is the new standard (TPCx-BB) for benchmarking and testing Big Data systems. The TPCx-BB specification describes several business use cases—queries—which require a broad combination of data extraction techniques ...
  • Database integrated analytics using R : initial experiences with SQL-Server + R 

    Berral, Josep Ll.; Poggi, Nicolas (2016)
    Conference lecture
    Open Access
    Most data scientists nowadays use functional or semi-functional languages like SQL, Scala or R to process data obtained directly from databases. Such a process requires fetching the data, processing it, and then storing it again, and such ...
  • Database Integrated Analytics Using R: Initial Experiences with SQL-Server + R 

    Berral, Josep Ll.; Poggi, Nicolas (Institute of Electrical and Electronics Engineers (IEEE), 2017-02-02)
    Conference lecture
    Open Access
    Most data scientists nowadays use functional or semi-functional languages like SQL, Scala or R to process data obtained directly from databases. Such a process requires fetching the data, processing it, and then storing it again, and such ...
  • The state of SQL-on-Hadoop in the cloud 

    Poggi, Nicolas; Berral García, Josep Lluís; Fenech, Thomas; Carrera Pérez, David; Blakeley, Jose; Minhas, Umar F.; Vujic, Nikola (Institute of Electrical and Electronics Engineers (IEEE), 2016)
    Conference report
    Open Access
    Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and are ready to use. ...