Show simple item record

dc.contributor.author: Buchaca Prats, David
dc.contributor.author: Berral García, Josep Lluís
dc.contributor.author: Wang, Chen
dc.contributor.author: Youssef, Alaa
dc.contributor.other: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
dc.contributor.other: Barcelona Supercomputing Center
dc.date.accessioned: 2021-02-18T10:39:33Z
dc.date.available: 2021-02-18T10:39:33Z
dc.date.issued: 2020
dc.identifier.citation: Buchaca, D. [et al.]. Proactive container auto-scaling for cloud native machine learning services. A: IEEE International Conference on Cloud Computing. "2020 IEEE 13th International Conference on Cloud Computing: 18–24 October 2020, virtual event: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 475-479. ISBN 978-1-7281-8780-8. DOI 10.1109/CLOUD49709.2020.00070.
dc.identifier.isbn: 978-1-7281-8780-8
dc.identifier.uri: http://hdl.handle.net/2117/340053
dc.description.abstract: Understanding the resource usage behavior of ever-growing machine learning workloads is critical to cloud providers offering Machine Learning (ML) services. The ability to auto-scale resources for customer workloads can significantly improve resource utilization, thus greatly reducing cost. Here we leverage the AI4DL framework [1] to characterize workloads and discover resource consumption phases. We advance the existing technology to an incremental phase discovery method that applies to more general types of ML workload, covering both training and inference. We use a time-window MultiLayer Perceptron (MLP) to predict phases in containers running different types of workload. Then, we propose a predictive vertical auto-scaling policy that resizes the container dynamically according to phase predictions. We evaluate our predictive auto-scaling policies on 561 long-running containers with multiple types of ML workloads. The predictive policy can reduce allocated CPU by up to 38% compared to the default resource provisioning policies set by developers. Comparing our predictive policies with commonly used reactive auto-scaling policies, we find that they accurately predict sudden phase transitions (with an F1-score of 0.92) and significantly reduce the number of out-of-memory errors (350 vs. 20). In addition, we show that the predictive auto-scaling policy keeps the number of resizing operations close to that of the best reactive policies.
dc.format.extent: 5 p.
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.subject: Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors
dc.subject.lcsh: Cloud computing
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Resource allocation
dc.subject.other: Cloud native
dc.subject.other: Machine learning service
dc.subject.other: Container
dc.subject.other: Auto-scaling
dc.title: Proactive container auto-scaling for cloud native machine learning services
dc.type: Conference lecture
dc.subject.lemac: Computació en núvol
dc.subject.lemac: Aprenentatge automàtic
dc.subject.lemac: Assignació de recursos
dc.contributor.group: Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
dc.identifier.doi: 10.1109/CLOUD49709.2020.00070
dc.description.peerreviewed: Peer Reviewed
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/9284206
dc.rights.access: Open Access
local.identifier.drac: 30574278
dc.description.version: Postprint (author's final draft)
local.citation.author: Buchaca, D.; Berral, J.; Wang, C.; Youssef, A.
local.citation.contributor: IEEE International Conference on Cloud Computing
local.citation.publicationName: 2020 IEEE 13th International Conference on Cloud Computing: 18–24 October 2020, virtual event: proceedings
local.citation.startingPage: 475
local.citation.endingPage: 479
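The abstract describes a predictive vertical auto-scaling policy: a phase predictor anticipates the workload's next resource-consumption phase, and the container is resized before the transition happens. A minimal sketch of that resize-from-predictions step, where the phase names, per-phase CPU profiles, and the 1.2× headroom factor are illustrative assumptions, not values from the paper:

```python
# Hypothetical sketch of a predictive vertical resizing policy.
# PHASE_PROFILES and HEADROOM are illustrative assumptions only.

PHASE_PROFILES = {      # expected peak CPU (cores) per discovered phase
    "load_data": 1.0,
    "train": 4.0,
    "checkpoint": 0.5,
}
HEADROOM = 1.2          # safety margin over the phase's expected peak
MIN_CORES = 0.5

def predictive_allocation(predicted_phases):
    """Map a sequence of predicted phases to per-step CPU allocations.

    Resizing is proactive: at each step we provision for the larger of the
    current and the next predicted phase, so the container already has the
    resources when a phase transition arrives."""
    alloc = []
    for t, phase in enumerate(predicted_phases):
        nxt = predicted_phases[t + 1] if t + 1 < len(predicted_phases) else phase
        need = max(PHASE_PROFILES[phase], PHASE_PROFILES[nxt])
        alloc.append(max(MIN_CORES, need * HEADROOM))
    return alloc

phases = ["load_data", "load_data", "train", "train", "checkpoint"]
print(predictive_allocation(phases))
```

A reactive policy would instead resize only after observing the new phase's usage, which is where the out-of-memory errors the abstract counts tend to occur; the one-step lookahead above is what makes the policy "proactive".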


Files in this item


This item appears in the following collections
