dc.contributor.author: Buchaca Prats, David
dc.contributor.author: Albuquerque Portella, Felipe
dc.contributor.author: Costa, Carlos H. A.
dc.contributor.author: Berral García, Josep Lluís
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2021-02-22T10:25:29Z
dc.date.available: 2021-02-22T10:25:29Z
dc.date.issued: 2020-12
dc.identifier.citation: Buchaca, D. [et al.]. You only run once: Spark auto-tuning from a single run. "IEEE transactions on network and service management", December 2020, vol. 17, no. 4, p. 2039-2051.
dc.identifier.issn: 1932-4537
dc.identifier.uri: http://hdl.handle.net/2117/340271
dc.description.abstract: Tuning configurations of Spark jobs is not a trivial task. State-of-the-art auto-tuning systems are based on iteratively running workloads with different configurations. During the optimization process, the relevant features are explored to find good solutions. Many optimizers enhance the time-to-solution using black-box optimization algorithms that do not take into account any information from the Spark workloads. In this article, we present a new method for tuning configurations that uses information from one run of a Spark workload. To achieve good performance, we mine the SparkEventLog that is generated by the Spark engine. This log file contains a large amount of information from the executed application. We use this information to enhance a performance model with low-level features from the workload to be optimized. These features include Spark Actions, Transformations, and Task metrics. This process allows us to obtain application-specific workload information. With this information our system can predict sensible Spark configurations for unseen jobs, given that it has been trained with reasonable coverage of Spark applications. Experiments show that the presented system correctly produces good configurations, while achieving up to 80% speedup with respect to the default Spark configuration, and up to 12x speedup of the time-to-solution with respect to a standard Bayesian Optimization procedure.
dc.description.sponsorship: This work is supported by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 639595); the Spanish Ministry of Economy, under contract TIN2015-65316-P, and the Generalitat de Catalunya under contract 2014SGR1051; the ICREA Academia program; the BSC-CNS Severo Ochoa program (SEV-2015-0493); and by Petroleo Brasileiro S. A. (PETROBRAS).
dc.format.extent: 13 p.
dc.language.iso: eng
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh: Bayesian statistical decision theory
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Parallel processing (Electronic computers)
dc.subject.other: Decision making for workload auto-tuning
dc.subject.other: Spark auto-tuning
dc.subject.other: Workload modeling
dc.subject.other: Workload placement
dc.title: You only run once: Spark auto-tuning from a single run
dc.type: Article
dc.subject.lemac: Estadística bayesiana
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Processament en paral·lel (Ordinadors)
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/TNSM.2020.3034824
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/9244226
dc.rights.access: Open Access
local.identifier.drac: 30567433
dc.description.version: Postprint (author's final draft)
dc.relation.projectid: info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
dc.relation.projectid: info:eu-repo/grantAgreement/AGAUR/V PRI/2014 SGR 1051
dc.relation.projectid: info:eu-repo/grantAgreement/EC/H2020/639595/EU/Holistic Integration of Emerging Supercomputing Technologies/Hi-EST
local.citation.author: Buchaca, D.; Albuquerque, F.; Costa, C.; Berral, J.
local.citation.publicationName: IEEE transactions on network and service management
local.citation.volume: 17
local.citation.number: 4
local.citation.startingPage: 2039
local.citation.endingPage: 2051

